GUS Gene Constructs 



Detailed In This Book 
Osbourn and Wilson, Chapter 10, Figure 1 Map of SP6 transcription plasmid pJII140 
(Sleat et al., 1987). pJII140 was derived from pSP64 (Promega Corp.) via pSP64TMV 
(Sleat et al., 1986), which contained a BamHI fragment bearing the TMV origin-of-assembly 
sequence (OAS; genome coordinates 5118-5550). The GUS gene contains no 
common restriction sites (Jefferson, 1987). BglII-linearized pJII140 was transcribed as 
described in the text prior to incubation with TMV coat protein. 
synthase; Bin 19, binary vector (Bevan, 1984). 
[Figure: schematic of three GUS mRNA constructs (AUG start codon, GUS coding region, UGA stop codon; a GpppG cap is drawn on some constructs), with the GUS specific activity measured for each, listed in the order shown.] 

GUS Specific Activity (nmoles/min/mg) 

Construct   CHO    Tobacco 
1           1.4    0.01 
2           20.0   0.34 
3           30.0   0.71 

Gallie et al., Chapter 13, Figure 1 GUS constructs used for electroporation of CHO 
and tobacco cells. Approximately 1 µg of each construct was used for electroporation. 
See text for details. 

Farrell and Beachy, Chapter 9 GUS constructs for protein targeting studies. See 
pGUSN358 -> S in Clontech list below. 



Commercially Available 



Many of the following plasmids are used in this book, and all are 
available from Clontech laboratories. Alternatively, plasmids can be 
obtained by writing to Dr. R. A. Jefferson (see Appendix A) or to the 
authors of the appropriate work. 




pBI101 (Jefferson et al., 1987) Designed for testing promoter activity, pBI101 confers 
kanamycin resistance. pBI101 is a derivative of pBIN19 and is unstable unless grown in 
the presence of kanamycin. 



pBI101.2 and pBI101.3 Identical to pBI101, except the reading frames are moved one 
and two nucleotides, respectively, relative to the polylinker. 
pBI121 (Jefferson et al., 1987) A derivative of pBI101 containing the 35S promoter of 
the cauliflower mosaic virus. 
pBI221 The CaMV 35S promoter-GUS-NOS-ter portion of pBI121 was cloned into 
pUC19 to produce pBI221. 
pRAJ255 (Jefferson et al., 1986) A 1.87 kb insert containing the GUS gene was 
cloned into pEMBL9 to produce pRAJ255. 







pRAJ260 Similar to pRAJ255 except for a modified EcoRI site. 
pRAJ275 This derivative of pRAJ255 contains a consensus translational initiator in 
place of deleted 5' GUS sequences. 
pGUSN358->S (Farrell and Beachy, Chapter 9, this book) pGUSN358->S contains a 
GUS gene modified by site-directed mutagenesis to eliminate a glycosylation site. This 
allows processing by the endoplasmic reticulum without the usual inactivation of GUS 
activity. 
Abstract 

The subcellular location of a protein is an important characteristic with functional implications, and hence the problem of 
predicting subcellular localization from the amino acid sequence has received a fair amount of attention from the 
bioinformatics community. This review attempts to summarize the present state of the art in the field. © 2001 Elsevier 
Science B.V. All rights reserved. 
1. Introduction 

The general problem of predicting the subcellular location of a protein 
from its amino acid sequence has long been a central one in 
bioinformatics. To date, three conceptually different approaches have 
been proposed: to look for the targeting signals that the cell uses as 
'address labels', to base the prediction on the observation that proteins 
from different cellular compartments tend to differ in subtle ways in 
their overall amino acid composition, and to use evolutionary 
relationships (based on the endosymbiotic origin of organelles) to infer 
the subcellular localization. There are even one or two 'meta-methods' in 
which outputs from a range of 'primary' prediction/analysis methods are 
combined in an optimal way. Each approach has its strengths and 
weaknesses, and since no across-the-board benchmarking tests have been 
performed, it is not yet possible to make a fair comparison between all 
the different methods proposed by different authors. 

In this review, we have chosen to first discuss the 
most commonly used methods for predicting individ- 
ual subcellular localizations - the secretory pathway, 
mitochondria, chloroplasts, and the nucleus - and 
then describe a couple of attempts to construct inte- 
grated predictors that try to 'sort' proteins between 
multiple compartments. The reader is also referred to 
a recent (and somewhat more ambitious) review by 
Nakai [1] for further details. 



2. Prediction of signal peptides for secretion 

N-terminal signal peptides target proteins to the secretory pathway in 
eukaryotic cells, and direct translocation across the cytoplasmic 
membrane in bacteria. It has long been known that they have a tripartite 
design with a short positively charged amino-terminal segment (n-region), 
a central hydrophobic segment (h-region), and a more polar C-terminal 
segment (c-region) that is recognized by the signal peptidase enzyme. The 
first methods to identify signal peptide sequences were published as 
early as the mid 1980s [2-4], but the currently most widely used method 
is the neural network-based SignalP predictor [5]. SignalP combines two 
different neural networks: one that is trained to discriminate between 
residues that belong and do not belong to a signal peptide (the S-score), 
and one that is trained only to recognize signal peptidase cleavage sites 
(the C-score). The cleavage site is predicted by multiplying together the 
C-score and the negative 'derivative' of the S-score (this serves to focus 
the prediction on the region where the S-score changes from high to low), 
while the discrimination between proteins that have and do not have a 
signal peptide is based on the mean S-score evaluated from the N-terminus 
to the predicted cleavage site. The current version of SignalP was 
trained on three different signal peptide data sets - one with eukaryotic 
signal peptides, one with signal peptides from Gram-negative bacteria, 
and one from Gram-positive bacteria - and hence is to some extent 
optimized for different organisms. 
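
The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SignalP implementation: the window size, the way the 'derivative' is estimated (mean S-score before a position minus mean S-score from that position onward), the clamping of negative values to zero, and all function names are choices made for this example only.

```python
from typing import List, Tuple


def predict_cleavage_site(c_score: List[float],
                          s_score: List[float],
                          window: int = 3) -> Tuple[int, List[float]]:
    """Return (index, combined scores); index is the first residue after the cut.

    The combined score at position i is C(i) times the drop in S-score
    across i. The window size and the clamping at zero are assumptions.
    """
    if len(c_score) != len(s_score):
        raise ValueError("C-score and S-score must have the same length")
    combined = []
    for i in range(len(s_score)):
        before = s_score[max(0, i - window):i]
        after = s_score[i:i + window]
        if not before or not after:
            combined.append(0.0)
            continue
        # Negative 'derivative': positive where S falls from high to low.
        drop = sum(before) / len(before) - sum(after) / len(after)
        combined.append(c_score[i] * max(drop, 0.0))
    best = max(range(len(combined)), key=lambda i: combined[i])
    return best, combined


def mean_s_score(s_score: List[float], cleavage_index: int) -> float:
    """Mean S-score from the N-terminus up to the predicted cleavage site."""
    region = s_score[:cleavage_index]
    return sum(region) / len(region) if region else 0.0


if __name__ == "__main__":
    # Toy per-residue scores (hypothetical, not real network output).
    s = [0.9] * 6 + [0.1] * 6
    c = [0.1] * 12
    c[6] = 0.8
    site, _ = predict_cleavage_site(c, s)
    print(site, round(mean_s_score(s, site), 2))
```

A protein would then be called signal peptide-containing when the mean S-score exceeds some threshold; the threshold itself is not given in the text and would have to be chosen on training data.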

SignalP-HMM is a new version of SignalP that is 
based on a hidden Markov model formalism [6]. This 
predictor was developed in order to improve the dis- 
crimination between signal peptides and N-terminal 
transmembrane anchor segments, but is in other re- 
spects comparable to the original SignalP predictor. 

According to a recent benchmarking study [7], Sig- 
nalP and SignalP-HMM perform equally well when 
it comes to discriminating between proteins with and 
without signal peptides, although the neural network 
version seems to be slightly better in predicting signal 
peptidase cleavage sites (Table 1). SignalP-HMM is 
however clearly superior for discriminating between 
cleavable signal peptides and N-terminal anchors. 
The two SignalP versions clearly outperformed the 
other programs tested, and thus seem to be the 
best signal peptide predictors available at the mo- 
ment. 

It should be mentioned that Chou recently re- 
ported a method similar in spirit to a weight-matrix 
method but including statistics on pairwise correla- 
tions between the positions closest to the signal pep- 
tidase cleavage site [8]; however, this method was not 
included in the benchmarking study. 



Table 1 

Performance of the two versions of SignalP: hidden Markov 
model version (HMM) and neural network version (NN) 

SignalP    Cleavage site location,    Discrimination, MCC 
version    % correct                  SP/non-SP               SP/SA 
           Euk    G-     G+           Euk    G-     G+        Euk 
HMM        69.5   81.4   64.5         0.94   0.93   0.96      0.74 
NN         72.4   83.4   67.5         0.97   0.89   0.96      0.39 

The cleavage site location is measured as the percentage of correctly 
assigned cleavage sites. The discrimination is measured by the 
Matthews correlation coefficient (MCC), which is one (1) for a 
perfect prediction and zero (0) for a totally random assignment 
[33]. The discrimination is given both between signal peptide-containing 
(SP) and signal peptide-lacking (non-SP) proteins 
and between secreted proteins (SP) and proteins anchored in 
the membrane (SA). The table is adapted from [6]. 
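
For readers unfamiliar with the Matthews correlation coefficient used in Table 1, the following short sketch computes it from the four cells of a two-class confusion matrix. The function name and the convention of returning zero when the denominator vanishes are assumptions of this example; the counts in the usage lines are invented.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from true/false positive and negative counts.

    Returns 0.0 when any marginal total is zero (a convention assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print(matthews_corrcoef(50, 50, 0, 0))    # perfect prediction
    print(matthews_corrcoef(25, 25, 25, 25))  # random-like assignment
```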

3. Prediction of mitochondrial targeting peptides 

Mitochondrial targeting peptides are enriched in 
positively charged residues (Arg in particular), lack 
negatively charged residues, and have the ability to 
form amphiphilic α-helices [9]. The amphiphilic 
structure is important for binding to receptors in 
the outer mitochondrial membrane [10,11], and the 
net positive charge may be needed during the ΔΨ-driven 
import across the inner mitochondrial membrane [12]. 

Three popular methods for predicting mitochondrial targeting peptides 
are TargetP [13], MitoProt [14], and Predotar (see Table 3). Both 
Predotar and TargetP are neural network predictors and are conceptually 
similar to SignalP. They are not clear-cut single-location predictors 
since they also deal with other presequences: both test for the presence 
of a chloroplast transit peptide, and TargetP additionally handles signal 
peptide prediction. Predotar is aimed essentially at plant sequences. The 
performance of Predotar and TargetP is discussed in Section 6 and 
summarized in Table 2. 

MitoProt predicts localization of a protein by calculating a number of 
physicochemical parameters from its amino acid sequence, and then 
computing a linear discriminant function (LDF) which is compared to a 
cutoff for mitochondrial/non-mitochondrial localization prediction. Both 
MitoProt and TargetP suggest a potential cleavage site of the predicted 
mitochondrial targeting peptides. 

Table 2 

Comparison of localization prediction for three multi-category predictors 

Predictor  Location  Plant set                              Non-plant set 
                     % Correct  Sensitivity  Specificity    % Correct  Sensitivity  Specificity 
TargetP    Chloro.   85.3       0.85         0.69           90.0 
           Mito.                0.82         0.90                      0.80         0.67 
           Secr.                0.91         0.95                      0.96         0.92 
PSORT      Chloro.   69.8       0.47         0.69           82.5 
           Mito.                0.66         0.87                      0.81         0.60 
           Secr.                0.82         0.74                      0.64         0.93 
Predotar   Chloro.   84.8       0.82         0.77           76.3 
           Mito.                0.86         0.87                      0.86         0.50 
           Secr.                (0.80)       n/a                       (0.65)       n/a 

The plant data set contains 940 proteins (from Swiss-Prot release 36) and the non-plant set contains 2738 proteins (from Swiss-Prot 
release 37) with annotated localization. Note that Predotar is not intended for use on the non-plant set, hence its partly poor performance 
on this set. Sensitivity is the fraction of true positive predictions relative to the set of proteins known to be localized in the respective 
compartment. Specificity is the fraction of true positive predictions relative to the set of proteins predicted to the respective compartment. 
'Percent correct' refers to the fraction of all proteins in a set for which the correct location is predicted. 
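
The three measures defined in the Table 2 footnote can be made concrete with a short sketch. Note that 'specificity' as defined there (true positives relative to everything predicted to a compartment) is what is often called precision elsewhere. The function names and the toy labels below are assumptions for illustration; they are not data from the benchmark.

```python
import math
from typing import Sequence


def sensitivity(true_loc: Sequence[str], pred_loc: Sequence[str], comp: str) -> float:
    """True positives / proteins known to be in the compartment."""
    known = sum(1 for t in true_loc if t == comp)
    hits = sum(1 for t, p in zip(true_loc, pred_loc) if t == comp and p == comp)
    return hits / known if known else math.nan


def specificity(true_loc: Sequence[str], pred_loc: Sequence[str], comp: str) -> float:
    """True positives / proteins predicted to be in the compartment."""
    predicted = sum(1 for p in pred_loc if p == comp)
    hits = sum(1 for t, p in zip(true_loc, pred_loc) if t == comp and p == comp)
    return hits / predicted if predicted else math.nan


def percent_correct(true_loc: Sequence[str], pred_loc: Sequence[str]) -> float:
    """Percentage of all proteins whose location is predicted correctly."""
    correct = sum(1 for t, p in zip(true_loc, pred_loc) if t == p)
    return 100.0 * correct / len(true_loc)


if __name__ == "__main__":
    # Hypothetical annotations and predictions for six proteins.
    truth = ["mito", "mito", "secr", "chloro", "other", "mito"]
    pred = ["mito", "secr", "secr", "chloro", "mito", "mito"]
    print(sensitivity(truth, pred, "mito"),
          specificity(truth, pred, "mito"),
          percent_correct(truth, pred))
```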

Predotar, TargetP and MitoProt only predict N- 
terminal mitochondrial targeting sequences, and no 
method exists that will identify import signals present 
elsewhere in the protein, although such signals are 
known to exist [15,16]. 
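
To illustrate the linear discriminant idea described for MitoProt, here is a deliberately simplified sketch. It is not MitoProt: the three features, the 20-residue N-terminal window, the weights, and the cutoff are all hypothetical values chosen for this example, whereas the real program uses its own set of physicochemical parameters and fitted coefficients.

```python
from typing import Dict

# Hypothetical weights and cutoff, for illustration only.
WEIGHTS: Dict[str, float] = {
    "arg_fraction": 4.0,
    "net_charge": 0.5,
    "acidic_count": -1.0,
}
CUTOFF = 1.0


def n_terminal_features(sequence: str, n: int = 20) -> Dict[str, float]:
    """Compute simple features of the first n residues (an assumed window)."""
    region = sequence[:n].upper()
    length = len(region) or 1
    basic = region.count("K") + region.count("R")
    acidic = region.count("D") + region.count("E")
    return {
        "arg_fraction": region.count("R") / length,
        "net_charge": float(basic - acidic),
        "acidic_count": float(acidic),
    }


def ldf_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear discriminant function: a weighted sum of the features."""
    return sum(weights[name] * features[name] for name in weights)


def predict_mitochondrial(sequence: str) -> bool:
    """Compare the LDF score to the cutoff."""
    return ldf_score(n_terminal_features(sequence), WEIGHTS) > CUTOFF


if __name__ == "__main__":
    # Invented sequences: one Arg-rich and non-acidic, one strongly acidic.
    print(predict_mitochondrial("MLSRRALRSARRLGAARSLT"))
    print(predict_mitochondrial("MDEEDLEDSEELDDEEAAGS"))
```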

4. Prediction of chloroplast transit peptides 

N-Terminal chloroplast transit peptides have 
highly variable lengths, contain very few negatively 
charged residues, and are highly enriched for hy- 
droxylated amino acids. Two neural-network based 
predictors are available: ChloroP [17] and Predotar 
(see Table 3). ChloroP also includes a separate mod- 
ule (based on a weight matrix) for predicting the 



transit peptide cleavage site. A comparison of local- 
ization prediction performance between Predotar and 
TargetP (of which ChloroP is a part) using 940 plant 
sequences from Swiss-Prot can be found in Table 2. 

Many thylakoid proteins have composite targeting 
signals with a typical transit peptide followed by a 
thylakoid targeting signal. The latter is usually very 
similar to the signal peptides found on secretory pro- 
teins, and can be identified by SignalP or SignalP- 
HMM (our unpublished data). A specialized weight 
matrix for predicting the cleavage site is available for 
thylakoid signal peptides [18]. 



5. Prediction of nuclear localization signals 

Nuclear localization signals are composed of one 
(monopartite) or a pair of (bipartite) short positively 



Table 3 

Web addresses of predictors 

Predictor    Web address (URL) 
ChloroP      http://www.cbs.dtu.dk/services/ChloroP/ 
MitoProt     http://www.mips.biochem.mpg.de/cgi-bin/proj/medgen/mitofilter 
predictNLS   http://maple.bioc.columbia.edu/predictNLS/ 
Predotar     http://www.inra.fr/Internet/Produits/Predotar/ 
PSORT        http://psort.nibb.ac.jp/ 
SignalP      http://www.cbs.dtu.dk/services/SignalP/ 
TargetP      http://www.cbs.dtu.dk/services/TargetP/ 
TMHMM        http://www.cbs.dtu.dk/services/TMHMM/ 

charged stretches in the protein chain. The monopartite 
nuclear localization signal has a short consensus 
sequence, K(K/R)X(K/R), and binds to a pocket 
on the surface of the importin α receptor [19]. In 
bipartite nuclear localization signals, the monopartite 
motif is combined with a second small cluster of 
basic residues, 10-12 residues N-terminal to the first. 

The basic clusters can be found anywhere within 
the protein chain, and are exposed on the surface of 
the folded protein. Since the entire chain has to be 
searched for nuclear localization signals, it is difficult 
to avoid false positive predictions. The best predictor 
available at the moment is based on a large collection 
of mono- and bipartite motifs [20]. It is capable of 
finding 43% of the known nuclear proteins with no 
false positive predictions on the set of Swiss-Prot 
entries (release 38) with unambiguously annotated 
localization. This is achieved through collection of 
known NLSs and their homologues, and applying 
an 'in silico mutagenesis' to extend the motifs as 
far as possible without matching any non-nuclear 
proteins. 
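
The monopartite consensus K(K/R)X(K/R) and the bipartite arrangement described above translate naturally into regular expressions, which also makes it easy to see why whole-chain scanning produces false positives: such short patterns occur by chance in many proteins. The sketch below is an assumption-laden illustration, not the predictNLS method. In particular, modelling the upstream basic cluster as two consecutive K/R residues and the '10-12 residues' as a spacer of that length are interpretations made for this example. The two test peptides are the SV40 large T-antigen and nucleoplasmin signals commonly used as textbook examples; they do not come from this review.

```python
import re
from typing import List, Tuple

# Lookahead groups so that overlapping matches are all reported.
MONOPARTITE = re.compile(r"(?=(K[KR].[KR]))")
# Assumed form: two basic residues, a 10-12 residue spacer, then the motif.
BIPARTITE = re.compile(r"(?=([KR]{2}.{10,12}K[KR].[KR]))")


def find_motifs(sequence: str, pattern: "re.Pattern[str]") -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_motifs("PKKKRKV", MONOPARTITE))
    print(find_motifs("KRPAATKKAGQAKKKK", BIPARTITE))
    print(find_motifs("MASTGLLAVE", MONOPARTITE))
```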

6. Integrated methods for predicting subcellular 
localization 

In these days of whole-genome sequencing, what is 
obviously needed are integrated prediction methods 
that somehow represent the entire protein sorting 
potential of the cell and assign the most likely sub- 
cellular localization to a protein based on its amino 
acid sequence. This also includes sorting within an 
organelle or a pathway: between, e.g., the mitochon- 
drial outer membrane, intermembrane space, inner 
membrane, and matrix, or between the different com- 
partments along the secretory pathway. In eukary- 
otic cells, the number of distinct compartments is 
thus very large. 

The pioneering work in this area is due to Nakai and Kanehisa [21,22]. 
The PSORT program now distinguishes between 17 different subcellular 
localizations (10 for a newer, retrained version called PSORT 2 that uses 
a slightly different decision algorithm), and integrates a number of 
pre-existing prediction programs as well as calculated characteristics 
such as overall amino acid composition within a unified framework 
[23,24]. Drawid and Gerstein [25] 



have recently presented a system that is similar in 
spirit to PSORT but uses a different formalism 
(Bayesian statistics) for integrating multiple kinds 
of information (everything from SignalP predictions 
to microarray expression profiles). The method was 
applied to the full Saccharomyces cerevisiae pro- 
teome, and thus provides estimates of the fraction 
of all yeast proteins found in different compartments. 
A predictor based only on overall amino acid com- 
position and pairwise residue correlations has been 
developed by Chou [26]. 
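
The idea of integrating several kinds of evidence with Bayesian statistics can be illustrated with a toy update rule. This sketch is not the system of Drawid and Gerstein [25]; it only shows how a prior over compartments is revised by one feature at a time, assuming the features are independent given the compartment. The compartment names, priors, and likelihoods below are invented for the example.

```python
from typing import Dict


def bayes_update(prior: Dict[str, float],
                 likelihood: Dict[str, float]) -> Dict[str, float]:
    """Posterior over compartments after observing one feature.

    likelihood[c] is P(observed feature value | compartment c).
    """
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("feature has zero likelihood in every compartment")
    return {c: v / total for c, v in unnormalized.items()}


if __name__ == "__main__":
    # Hypothetical numbers: start from an even prior over two compartments.
    belief = {"nucleus": 0.5, "cytoplasm": 0.5}
    # Feature 1: an NLS-like motif is present (assumed likelihoods).
    belief = bayes_update(belief, {"nucleus": 0.8, "cytoplasm": 0.2})
    # Feature 2: no signal peptide predicted (assumed uninformative here).
    belief = bayes_update(belief, {"nucleus": 0.5, "cytoplasm": 0.5})
    print(belief)
```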

The TargetP predictor [13] has a more limited 
scope than PSORT, and only differentiates between 
secretory proteins, mitochondrial proteins, chloro- 
plast proteins, and everything else. The method 
looks for N-terminal sorting signals by feeding the 
outputs from SignalP, ChloroP, and an analogous 
mitochondrial predictor (not available as a stand- 
alone predictor) into a 'decision neural network' 
that makes the final choice between the different 
compartments. Although not yet integrated into 
TargetP, membrane proteins can be predicted with 
high reliability by programs such as TMHMM 
[27,28]. TargetP predicts signal peptides with high 
sensitivity and specificity but performs less well on 
mitochondrial targeting peptides and chloroplast 
transit peptides, Table 2. Modules for predicting 
cleavage sites in the different targeting signals are 
also included in TargetP; again, performance is 
much better on the signal peptides than on the other 
two classes of peptides. 

Predotar is primarily aimed at predicting the chlo- 
roplast/mitochondrion sorting problem (thus dealing 
with plant sequences), and can also predict dual lo- 
calization - both chloroplastic and mitochondrial - 
which is an existing reality for some proteins [29]. 
The level of overall prediction accuracy is around 85% on a plant test 
set, the same as for TargetP (Table 2). The two predictors do, however, 
differ somewhat in their performance on the subsets, and trying both 
predictors on sequences of interest could prove useful. 

Finally, an interesting approach to subcellular localization prediction 
has been presented by Eisenberg and co-workers [30]. They use a protein's 
'phylogenetic profile' (i.e., a list of the presence or absence of 
orthologs to the query protein in all fully sequenced genomes) to predict 
its localization, based on the assumption that the endosymbiont origin of 
different compartments will be reflected in the phylogenetic profiles of 
their respective proteomes. Thus, mitochondrial proteins (even the 
nuclearly encoded ones) will be most highly related to proteins from 
bacteria such as Rickettsia prowazekii [31], whereas chloroplast proteins 
will be most highly related to those found in photosynthetic bacteria. 
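
A phylogenetic profile as defined above is simply a presence/absence vector over genomes, so a toy version of the approach can be written as a nearest-profile lookup. This is a hedged sketch, not the method of reference [30]: the use of Hamming distance, the nearest-neighbour rule, the genome order, and the reference profiles are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Profile = Tuple[int, ...]  # 1 = ortholog present in that genome, 0 = absent


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of genomes in which two profiles disagree."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    return sum(x != y for x, y in zip(a, b))


def predict_by_profile(query: Profile,
                       reference: List[Tuple[Profile, str]]) -> str:
    """Assign the localization of the closest labelled profile."""
    _, label = min(reference, key=lambda item: hamming(query, item[0]))
    return label


if __name__ == "__main__":
    # Hypothetical genome order: two alpha-proteobacteria, two cyanobacteria.
    reference = [
        ((1, 1, 0, 0), "mitochondrion"),
        ((0, 0, 1, 1), "chloroplast"),
    ]
    print(predict_by_profile((1, 1, 0, 1), reference))
```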

Unfortunately, the different methods discussed in 
this section have not been evaluated together using a 
common benchmark (since the different methods do 
not distinguish between the same set of compart- 
ments, such an evaluation is not trivial). TargetP 
has the conceptual advantage that it tries to identify 
biologically well-characterized sorting signals and 
hence allows a certain amount of 'critical evaluation 
by eye' after the prediction has been made. The phy- 
logenetic-profile approach also has a clear biological 
foundation, and again a human user may critically 
evaluate the results (i.e., the list of orthologs) against 
his or her biological knowledge. The purely statistical 
methods are at a disadvantage in this respect since 
they are based on sequence characteristics that are 
not easily evaluated by eye and, insofar as they in- 
corporate amino acid composition measures, only 
correlate with subcellular localization indirectly 
(e.g., as a result of surface-exposed residues being 
adapted to a low-pH environment [32]). 

7. Conclusions 

The complex compartmentalization of a biological 
cell cannot yet be accurately captured by bioinfor- 
matics. For compartments where the sorting signals 
can to a good approximation be regarded as short 
stretches of amino acids with little interaction with 
the rest of the protein, the sequence analysis tools 
now available do a decent job. In cases where the 
sorting signals are presented in the context of a 
folded protein, however, they are very difficult to 
identify and one often has to resort to purely statis- 
tical approaches (amino acid composition) or meth- 
ods based on sequence similarity. With improved 
fold recognition and three-dimensional structure pre- 
diction algorithms, it may eventually become possi- 
ble both to detect these more complex sorting signals 



and to predict the location of a protein based on its 
general surface characteristics. In any event, the pre- 
diction of subcellular protein localization will most 
likely remain an important problem area for bioin- 
formatics for some time to come. 
Sorting of Phaseolin to the Vacuole Is Saturable and Requires 
a Short C-Terminal Peptide 

Lorenzo Frigerio,a,b Maddalena de Virgilio,a,1 Alessandra Prada,a Franco Faoro,c and Alessandro Vitale a,2 

a Istituto Biosintesi Vegetali, Consiglio Nazionale delle Ricerche, via Bassini 15, 20133 Milan, Italy 
b Department of Biological Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom 
c Centro Miglioramento Sanitario Colture Agrarie, Consiglio Nazionale delle Ricerche, via Celoria 2, 20133 Milan, Italy 

Phaseolin, one of the major legume proteins for human nutrition, is a trimeric glycoprotein of the 7S class that accumu- 
lates in the protein storage vacuoles of common bean. Phaseolin is cotranslationally introduced into the lumen of the 
endoplasmic reticulum; from there, it is transported through the Golgi complex to the storage vacuoles. Phaseolin is 
also transported to the vacuole in vegetative tissues of transgenic plants. By transient and permanent expression in 
tobacco leaf cells, we show here that vacuolar sorting of phaseolin is saturable and that saturation leads to 
Golgi-mediated secretion from the cell. A mutated phaseolin, in which the four C-terminal residues (Ala, Phe, Val, and Tyr) 
were deleted, efficiently formed trimers but was secreted entirely outside of the cells in transgenic tobacco leaves, in- 
dicating that the deleted sequence contains information necessary for interactions with the saturable vacuolar sorting 
machinery. In the apoplast, the secreted phaseolin remained intact; this is similar to what occurs to wild-type phaseolin 
in bean storage vacuoles, whereas in vegetative vacuoles of transgenic plants, the storage protein is fragmented. 



INTRODUCTION 

Phaseolin is the major storage protein of common bean. 
Phaseolin is a member of the 7S vicilin class and one of the 
most important legume proteins for human nutrition; a num- 
ber of efforts have been made to improve its nutritional 
value (Hoffman et al., 1988; Dyer et al., 1995). The structure, 
genetic makeup, cotranslational and post-translational modifications, 
and intracellular transport of phaseolin have been largely elucidated 
by numerous investigators (Bollini et al., 1982; Slightom et al., 1985; 
Sturm et al., 1987; Lawrence et al., 1994), but the mechanisms that allow 
correct intracellular targeting of phaseolin and the other 7S storage 
proteins have not been fully characterized. 

Phaseolin is a homotrimeric soluble protein that accumu- 
lates in the protein storage vacuoles of cotyledonary cells. 
Its synthesis, maturation, and intracellular targeting are me- 
diated by the secretory pathway, which delivers proteins 
into the endoplasmic reticulum (ER) and from there to the 
cell surface or the vacuoles (Okita and Rogers, 1996). The 
Golgi complex as well as other intermediate compartments 
mediate this traffic. 

Protein constructs that have a transient signal peptide for 
cotranslational insertion into the ER, but no other specific 
sorting signal, are secreted from plant cells (Denecke et al., 
1990; Hunt and Chrispeels, 1991). Soluble proteins destined 
for the different vacuoles are sorted from the proteins des- 
tined for the apoplast probably at the exit of the Golgi com- 
plex (Ahmed et al., 1997; Paris et al., 1997). Sorting occurs 
because vacuolar proteins have structures, often identified 
as short stretches of amino acids in propeptides, that are 
not present in apoplastic proteins. The signals are variable, and 
different vacuolar sorting mechanisms must exist (Matsuoka 
et al., 1995; Kirsch et al., 1996). A putative integral mem- 
brane receptor that recognizes some but not all of these 
stretches in vitro has been identified (Kirsch et al., 1996; 
Ahmed et al., 1997). 

By expressing different phaseolin constructs, we show 
here that although phaseolin synthesized in cells of trans- 
genic tobacco leaves is exclusively vacuolar, vacuolar tar- 
geting of this protein can be saturated at high transient 
expression levels in the same cell types, resulting in the 
Golgi-mediated secretion of the protein. We also show that 
removal of the four C-terminal residues (Ala, Phe, Val, and 
Tyr) allows correct trimer formation but causes phaseolin to 
be entirely secreted outside of the cells in transgenic tobacco 
leaves and that this C-terminal sequence is most probably exposed 
on the surface of wild-type phaseolin trimers. 
Wild-type phaseolin is fragmented in vegetative 
vacuoles of transgenic plants (Murai et al., 1983; Sengupta- 
Gopalan et al., 1985; Bagga et al., 1992; Pedrazzini et al., 
1997), whereas the mutated construct secreted into the 
apoplast is not subjected to fragmentation. Our results are 
consistent with the presence of a saturable vacuolar sorting 
mechanism that recognizes a discrete signal located at the 
C terminus of phaseolin. In transgenic tobacco leaves that 
express wild-type phaseolin, this mechanism operates close 
to its saturation limit. 



RESULTS 



Assembly-Competent Phaseolin Is Partially Secreted 
from Transiently Transformed Protoplasts 

This study was performed using the mutated phaseolin T343F 
and other mutants constructed in a T343F background. Figure 1 
shows the sequence of wild-type β-phaseolin and of 
the mutated constructs used in this work (from amino acid 
341 to the C terminus); we have not introduced any mutation 
in the preceding sequence. In T343F, we inactivated the 
second of the two N-glycosylation sites of a wild-type 
β-phaseolin sequence; this mutated phaseolin is therefore 
glycosylated with a single glycan (Pedrazzini et al., 1997). 
The mutation has no negative effect on the assembly and 
intracellular transport of phaseolin (Pedrazzini et al., 1997), 
and indeed in wild-type phaseolin, the second glycosylation 
site is a "weak" site, which is glycosylated only in some of 
the molecules. The advantage of using T343F rather than 
wild-type phaseolin resides in the fact that modifications of 
the single glycan by enzymes located in the Golgi complex 
can be used to trace the intracellular transport of the protein 
(Sturm et al., 1987). 

While examining the intracellular transport of mutated 
phaseolin polypeptides, which we produced to study the re- 
lationships between trimerization and intracellular traffic, we 
observed that assembly-competent polypeptides, but not 
assembly-defective ones, could be detected in part in the 
incubation medium of transiently transformed tobacco protoplasts. 
This is shown in Figure 2A. Tobacco mesophyll 
protoplasts were transiently transformed with plasmid without 
inserts, as a control, or with plasmid carrying the sequence 
encoding T343F or T343FΔ360 (hereafter referred 
to as Δ360). The latter is an assembly-defective deletion mutant 
of phaseolin (Figure 1; Pedrazzini et al., 1997). Protoplasts 
were subjected to pulse-chase labeling with 35S-labeled 
methionine and cysteine. Finally, we analyzed phaseolin by 
SDS-PAGE and fluorography after immunoprecipitation 
from cell homogenates or incubation media. 

The polypeptides detected when the control plasmid 
without inserts was used represent nonspecific contami- 
nants recognized by the antiserum. Of these, the major im- 
munoprecipitable contaminant is a polypeptide of 40 kD 
(Figure 2A, lanes 1 and 2); the synthesis of this polypeptide 
must be induced largely by the stress imposed by transient 
transformation, because there was almost no material at 40 
kD when immunoprecipitations were performed using un- 
treated protoplasts from transgenic leaves (Figure 2B). 

After a 1-hr pulse, T343F synthesized during transient expression 
was detectable as an abundant band of 46 kD and 
less abundant fragments of 20 to 25 kD (Figure 2A, lane 3). 
After a 5-hr chase, a substantial proportion of intact phaseolin 
was converted into the smaller fragments (Figure 2A, cf. 
lanes 3 and 4). Post-translational fragmentation is a characteristic 
of phaseolin expressed in heterologous plants 
(Bagga et al., 1992), although in bean cotyledons, the storage 
protein does not undergo such fragmentation. Fragmentation 
is a result of transport to the heterologous 
vacuoles (Pedrazzini et al., 1997). A substantial amount of 
T343F phaseolin was also recovered unfragmented from the 
incubation medium, where it accumulated during the chase 
(Figure 2A, lanes 9 and 10). After a 5-hr chase, secreted 
T343F represented ~25% of the amount synthesized during 
the pulse and almost no intracellular intact phaseolin remained, 
suggesting that the rest had been targeted to the 
vacuole. The overall amount of T343F recovered after the 
chase in this and similar experiments was lower than that 
recovered after the pulse, most probably because of full 
degradation of a proportion of phaseolin in the vacuole. 

Assembly-defective forms of phaseolin that remain mono- 
meric are unable to be transported along the secretory path- 
way in transgenic plants (Pedrazzini et al., 1997). During 
transient expression, the assembly-defective construct Δ360, 



Figure 1. Comparison of the Amino Acid Sequence of Wild-Type β-Phaseolin and the Mutated Constructs Used in This Work. 
The single letter code from residue 341 to the C terminus is used. The numbers on top identify the positions of the residues, starting from the 
first methionine of the phaseolin precursor (signal peptide included). Dots indicate identical residues. Asterisks indicate the stop codons 
introduced in Δ418 and Δ360. The Thr-to-Phe mutation at position 343 destroys the consensus for N-glycosylation at position 341. 
Figure 2. Transport-Competent Phaseolin Is in Part Secreted from 
Transiently Transfected Protoplasts but Not from Transgenic Protoplasts. 

(A) Tobacco leaf protoplasts were transiently transfected with plasmid 
encoding the phaseolin constructs T343F or Δ360 or with a 
plasmid without an insert (Co). Cells were pulse-labeled with 35S-methionine 
and 35S-cysteine for 1 hr and chased for the indicated 
periods of time. Cells and the corresponding incubation media were 
then homogenized, subjected to immunoprecipitation with the 
anti-phaseolin antiserum, and analyzed by SDS-PAGE and fluorography. 
The open arrowhead indicates intact T343F; the filled arrowhead 
indicates Δ360; and the vertical bar indicates phaseolin fragmentation 
products. Numbers at left indicate molecular mass markers in kilodaltons. 

(B) Protoplasts from leaves of transgenic tobacco expressing T343F 
were subjected to pulse-chase and analyzed as described in (A). 
Symbols are as given in (A). 



which has a molecular mass of ~38 kD, was entirely recovered
intracellularly, and we could not detect it in the incubation
medium (Figure 2A, lanes 5 and 6, and 11 and 12). During
the 5-hr chase, the protein was in part degraded: we have 
established that this degradation does not involve trafficking 
along the secretory pathway but that it is due to the not yet 
fully clarified mechanism of ER quality control that retains 
and eventually degrades defective polypeptides (Pedrazzini 
et al., 1997). Therefore, during transient expression,
assembly-competent phaseolin is in part secreted, whereas a form
of phaseolin subjected to quality control is not. 



Transgenic Tobacco Plants Do Not Secrete Phaseolin 

We have produced transgenic plants constitutively express- 
ing T343F (Pedrazzini et al., 1997). As shown in Figure 2B, 
mesophyll protoplasts isolated from these plants do not se- 
crete phaseolin to detectable amounts, suggesting fully effi- 
cient vacuolar sorting. 

To confirm that in intact mesophyll tissue the location of 
T343F is exclusively vacuolar and not extracellular, we ana- 
lyzed leaf tissue by immunoelectron microscopy. T343F was 
detected exclusively in the vacuolar lumen in large electron- 



dense protein bodies (Figure 3D) that could be decorated by 
the anti-phaseolin antiserum (Figures 3A and 3B) but not by
the preimmune serum (Figure 3C). Similar protein bodies 
were not found in cells of tobacco transformed with the 
plasmid without an insert (data not shown). No labeling 
above the background was detectable in the cell wall or the 
intercellular spaces (Figures 3A and 3B). From the results
shown in Figures 2B and 3, we conclude that in transgenic 
leaves, T343F phaseolin is efficiently targeted to the vacu- 
oles, where it is fragmented and forms protein bodies and is 
not secreted at detectable levels. 



The Golgi Complex Mediates Secretion of Phaseolin 
from Transiently Transformed Protoplasts 

One possible explanation for the different behavior of 
phaseolin during transient versus permanent expression is 
that the high expression level reached in the transiently 
transformed cells saturates the vacuolar sorting mechanism 
of phaseolin, leading to default traffic along the secretory 
pathway to the cell surface. In the following experiments, we 
tested this hypothesis. 

We first determined whether secreted phaseolin travels 
through the Golgi complex, which is the usual route to se- 
cretion, rather than following some "nonclassical" route to 
the cell surface. The Golgi complex contains many glycosi- 
dases and glycosyltransferases, and one of its functions is 
the modification of glycans of glycoproteins. When phaseolin
has only one oligosaccharide chain, its glycan undergoes
processing during passage through the Golgi complex 
(Sturm et al., 1987). This processing makes the glycan resis- 
tant to digestion by endoglycosidase H, a fungal enzyme 
that removes unprocessed Asn-linked glycans from glyco- 
proteins. Resistance to digestion by endoglycosidase H in 
vitro is widely used as a tool to investigate whether a protein 
has encountered Golgi complex-processing enzymes. 

Protoplasts transiently transformed with plasmid encod- 
ing T343F were subjected to pulse-chase labeling, and 
phaseolin was immunoprecipitated and treated with en- 
doglycosidase H or without the enzyme as control. At the 
end of the pulse, intact T343F was largely susceptible to en- 
doglycosidase H digestion, as indicated by its slightly lower 
apparent molecular mass after digestion (Figure 4, cf. lanes 
3 and 4). A very small proportion of phaseolin comigrated 
with the deglycosylated form even in the undigested control 
(this is visible in Figure 2A, lane 3, as well), indicating that a 
small proportion of polypeptides was synthesized in the un- 
glycosylated form. After 5 hr of chase, the phaseolin 
polypeptides that had not yet been fragmented remained 
largely susceptible to digestion, indicating that most of them 
had not yet traveled through the Golgi complex (Figure 4, 
lane 6). However, the apparent molecular mass of the three 
major fragmentation products was not affected by endogly- 
cosidase H (Figure 4, cf. the 20- to 25-kD fragments in lanes 
5 and 6). 

Figure 3. In Transgenic Tobacco Leaf Cells, T343F Is Located in Vacuoles.

(A) to (C) Thin sections of transgenic tobacco leaves expressing T343F were treated with anti-phaseolin antiserum ([A] and [B]) or preimmune
serum (C), and the bound antibodies were visualized by using goat anti-rabbit 15-nm gold complex.
(D) A lower magnification of an untreated sample.

CH, chloroplast; CW, cell wall; IS, intercellular space; PB, protein bodies formed by phaseolin; V, vacuole. Bars = 1 μm.

Figure 4. Both Vacuolar and Secreted T343F Acquire Endoglycosidase H
Resistance.

Lanes 1 to 6 show endoglycosidase H treatment of phaseolin. Protoplasts
were transiently transfected with the plasmid encoding
T343F, pulse-labeled with 35S-methionine and 35S-cysteine for 1 hr,
and chased for the indicated periods of time. Phaseolin was
immunoselected from the cell homogenates or the incubation media and
then treated with endoglycosidase H (endo H; +) or without enzyme
as control (-) and analyzed by SDS-PAGE and fluorography. Lanes
5 and 6 contain material from a higher number of protoplasts than
are contained in lanes 3 and 4, to allow clearer detection of the
phaseolin fragmentation products. Numbers at left indicate molecular
mass markers in kilodaltons. Lanes 7 and 8 show the effect of
tunicamycin on the relative molecular mass of the phaseolin fragments.
Protoplasts were transiently transfected with plasmid encoding
T343F and were subjected to pulse-chase in the presence of
tunicamycin (Tm; +) or in the absence of the inhibitor (-). Pulse was
for 1 hr with 35S-methionine and 35S-cysteine and was followed by 5
hr of chase. At the end of the chase, phaseolin was immunoselected
from the cell homogenates and analyzed by SDS-PAGE and fluorography.
Only the portion of the gel relative to the phaseolin fragments
is shown.



To determine whether any of the fragments contain the 
glycan, we performed radioactive labeling of protoplasts in 
the presence of tunicamycin, a fungal inhibitor of N-glycosylation.
In the presence of the inhibitor, there was a marked
decrease in the apparent molecular mass of the faster mi- 
grating phaseolin fragment, indicating that in normal condi- 
tions, this fragment is glycosylated, whereas the two slower 
migrating ones are not (Figure 4, cf. lanes 7 and 8). There- 
fore, in tobacco, as in bean cotyledons, the glycan of singly 
glycosylated phaseolin acquires a complex structure along 
the route to the vacuole. Secreted T343F was almost com- 
pletely resistant to endoglycosidase H (Figure 4, cf. lanes 1 
and 2). The proportion of resistant polypeptides was much 
higher in secreted T343F than in the intracellular T343F that 
was still unfragmented after the chase, ruling out the possi- 
bility that secretion of phaseolin resulted from some non- 
classical "shortcut" delivery from the ER directly to the cell 
surface. We conclude that the Golgi complex is involved in 
the secretion of phaseolin. 



Phaseolin Is Secreted Only When Expressed 
at High Levels 

If secretion of phaseolin results from saturation of the vacu- 
olar targeting mechanism, we expected a reduction in se- 
cretion when the level of expression was lowered. As a test, 
we transformed protoplasts with different amounts of plas- 
mid (Figure 5). In our standard transformation protocol, we 
use 40 μg of plasmid to transform 10^6 protoplasts. This leads
to secretion levels of T343F that vary from 20 to 50% after 5
hr of chase, using different preparations of protoplasts. Figure 5
shows that when 2 μg was used, secretion was below
our limit of detection, although phaseolin was still sorted to
the vacuole, as indicated by the accumulation of fragmentation
products during the chase. When 20 μg of plasmid was
used to transform the same preparation of protoplasts, ex- 
pression of T343F was six times higher and part of the pro- 
tein was secreted (Figure 5). 

Therefore, the secretion of phaseolin is dose dependent 
and it occurs when protoplasts are transiently transformed, 
but it does not occur when protoplasts are isolated from 
transgenic plants and are not subjected to transient trans- 
formation. This could indicate two possibilities: (1) the stress 
imposed on the cells by the transient transformation proce- 
dure has the side effect of altering the capacity of the vacu- 
olar sorting mechanism; and (2) secretion is truly caused by 
the sudden burst of phaseolin expression, which saturates 

Figure 5. Secretion of T343F Is Dose Dependent.

Tobacco protoplasts were transiently transfected with 20 or 2 μg of
the plasmid encoding T343F and then pulse-labeled for 1 hr with
35S-methionine and 35S-cysteine and chased for the indicated periods
of time. Cell homogenates and the corresponding incubation
media were then immunoprecipitated with the anti-phaseolin antiserum
and analyzed by SDS-PAGE and fluorography. Numbers at left
indicate molecular mass markers in kilodaltons. At bottom is a
longer exposure with respect to the fluorograph at top, to allow
clearer detection of the phaseolin fragmentation products (indicated
by the vertical bar at right).

the normal vacuolar sorting capacity of the cells. To distinguish
between these possibilities, we supertransformed protoplasts
isolated from transgenic plants expressing T343F
with the plasmid containing the same construct or with the
plasmid without insert as a control. Transfection with the
control plasmid did not change the destiny of phaseolin,
whereas the overexpression of phaseolin resulted in secretion
(Figure 6, lanes 7 and 8; cf. with lanes 5 and 6). The very
small amount of intact phaseolin recovered in the medium 
after the pulse shown in lane 5 represents contamination 
from a small proportion of broken cells; consistently, there is 
no secreted phaseolin after the chase shown in lane 6. 

The average overexpression of T343F as a result of the 
supertransformation was ~2.5-fold in the experiment shown 
in Figure 6. However, overexpression is probably much 
higher in the cells receiving the transfected DNA, because 
these are usually a small proportion of the total (Denecke et 
al., 1989). Secretion of the extra amount of phaseolin syn- 
thesized because of the transient transformation was almost 
quantitative. The amount of fragmentation products accu- 
mulated intracellularly after the chase was very similar in 
control-transformed and supertransformed protoplasts (Figure 6,
cf. lanes 2 and 4). These results indicate that secretion
is the result of saturation of the sorting machinery and that 
the mechanism that targets phaseolin to the vacuole is very 
close to its saturation in the transgenic plants. 
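
The saturation argument above can be restated as simple arithmetic. The following is a minimal sketch, assuming a hard threshold model in which the sorting machinery delivers phaseolin to the vacuole up to a fixed capacity and everything above that capacity is secreted. The function name `partition` and the capacity value of 1.0 (set equal to the expression level in the transgenic plants) are illustrative assumptions, not quantities measured in this work; only the 2.5-fold average overexpression comes from the text.

```python
def partition(expression, capacity):
    """Split synthesized phaseolin between vacuole and secretion.

    Hypothetical threshold model: up to `capacity` units are sorted to the
    vacuole; any amount above `capacity` is secreted.
    """
    vacuolar = min(expression, capacity)
    secreted = max(0.0, expression - capacity)
    return vacuolar, secreted


# Units are relative to the expression level in the transgenic plants (1.0).
CAPACITY = 1.0  # assumption: sorting capacity is about equal to that level

for fold in (1.0, 2.5):  # 2.5-fold: average overexpression reported for Figure 6
    vacuolar, secreted = partition(fold, CAPACITY)
    print(f"{fold:.1f}x expression: vacuolar={vacuolar:.2f}, "
          f"secreted={secreted:.2f}, fraction secreted={secreted / fold:.2f}")
```

Under these assumptions the vacuolar amount is the same at 1.0-fold and 2.5-fold expression, and all of the extra protein appears in the medium, which is the pattern described for control-transformed versus supertransformed protoplasts. Because only a small proportion of cells take up DNA, the per-cell overexpression is probably higher than the average used here.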

Deletion of the C-Terminal Sequence Ala-Phe-Val-Tyr 
Causes Complete Secretion of Phaseolin in Transgenic 
Tobacco Leaves 

The results presented above suggest that the vacuolar sort- 
ing of phaseolin occurs because of recognition events that 
are saturable. This is consistent with the presence of a re- 
ceptor mechanism that recognizes specific structural fea- 
tures of phaseolin. Signals for sorting to the plant vacuoles 
have been identified on several proteins. There is not a 
unique consensus; however, a high percentage of hydro- 
phobic residues is a characteristic of some of the known 
C-terminal vacuolar sorting signals (Bednarek et al., 1990; 
Neuhaus et al., 1991; Saalbach et al., 1996). When analyzing 
the phaseolin sequence, we observed that there was an enrichment
in hydrophobic amino acids at its C terminus: Lys
is followed by the sequence Gly-Ala-Phe-Val-Tyr (Figure 1). 
Therefore, we tested whether the hydrophobic C-terminal 
end of phaseolin contains information necessary for vacu- 
olar targeting. We introduced a stop codon after residue 417 
of T343F to generate T343FA418 (hereafter referred to as 
A418; Figure 1). Thus, A418 lacks the last four amino acid 
residues, Ala-Phe-Val-Tyr, which form a highly hydrophobic 
sequence. We produced transgenic tobacco plants expressing
A418 under the control of the cauliflower mosaic
virus 35S promoter. 
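
As an illustration of what "highly hydrophobic" means for the deleted tetrapeptide, the sketch below averages Kyte-Doolittle hydropathy values over a peptide. The choice of the Kyte-Doolittle scale is an assumption made for this example; the text does not state which measure of hydrophobicity was used. The helper name `mean_hydropathy` is hypothetical.

```python
# Kyte-Doolittle hydropathy index; positive values are hydrophobic.
# Using this particular scale is an assumption of this sketch.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Return the average Kyte-Doolittle value of a one-letter peptide."""
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide) / len(peptide)


# Ala-Phe-Val-Tyr, the four residues missing from A418.
print(round(mean_hydropathy("AFVY"), 3))
# The same tetrapeptide preceded by Lys-Gly, as in the phaseolin C terminus.
print(round(mean_hydropathy("KGAFVY"), 3))
```

On this scale the tetrapeptide has a clearly positive mean, driven by Ala, Phe, and Val, whereas including the preceding Lys-Gly lowers the average considerably.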

We isolated total proteins from transgenic leaves and ana- 
lyzed them by SDS-PAGE, which was followed by immuno- 



blot analysis with the anti-phaseolin antiserum. The results 
are shown in Figure 7A. Plants expressing A418 accumulated
a major phaseolin polypeptide with an Mr of ~45,000
and a minor one with a slightly lower relative molecular 
mass. These polypeptides are of the relative mass expected 
for intact phaseolin and may represent the glycosylated and 
unglycosylated forms of the protein. Immunoblot analysis of
an extract from tobacco leaves expressing T343F is also 
shown in Figure 7A for comparison. Clearly, the typical 
phaseolin fragmentation products present when the protein 
is synthesized in transgenic plants are absent in A418- 
expressing plants. 

Because the fragments in the 20- to 25-kD range are 
formed when phaseolin is targeted to the tobacco vacuoles, 
we investigated whether A418 accumulates in the apoplast 
instead of being delivered to vacuoles. We were able to visualize
A418 and T343F on silver-stained SDS-PAGE gels,
after immunoprecipitation with the anti-phaseolin antiserum 
(Figure 7B). When phaseolin was immunoprecipitated from 
total leaf homogenates, A418 and the fragmentation products
of T343F were clearly detectable (Figure 7B, lanes 1

Figure 6. Vacuolar Sorting of T343F in Transgenic Protoplasts Is
Saturable.

Protoplasts from leaves of transgenic tobacco expressing T343F
were transfected with either the plasmid without an insert (T343F/
Co) or the plasmid encoding T343F (T343F/T343F). Cells were
subjected to a 1-hr pulse with 35S-methionine and 35S-cysteine and
chased for the indicated periods of time. Cell homogenates and the
corresponding incubation media were immunoprecipitated with the
anti-phaseolin antiserum and then analyzed by SDS-PAGE and
fluorography. Numbers at left indicate molecular mass markers in
kilodaltons.

Figure 7. A418 Is Completely Secreted from Leaf Cells of Transgenic
Tobacco.

(A) Leaves from transgenic tobacco expressing T343F or A418
(three independent transgenic plants) or transformed with the plasmid
without inserts (Co) were homogenized, and the homogenates
were analyzed by SDS-PAGE, followed by protein gel blotting and
immunodetection with the anti-phaseolin antiserum. Lanes 1 to 4
and 5 and 6 represent independent immunodetections.

(B) Whole-leaf homogenates or protoplast (protop.) homogenates
from plants expressing T343F or A418 were immunoprecipitated
with the anti-phaseolin antiserum and analyzed by SDS-PAGE. The
gels were then stained with silver stain. The arrowhead indicates the
position of intact phaseolin. The positions of the heavy (H) and light
(L) chains of the antibodies used for immunoprecipitation are indicated.

(C) Protoplasts from transgenic tobacco plants expressing T343F or
A418 were pulse-labeled for 1 hr with 35S-methionine and 35S-cysteine
and chased for the indicated periods of time. Cell homogenates and
the corresponding incubation media were immunoprecipitated with
the anti-phaseolin antiserum and then analyzed by SDS-PAGE and
fluorography.

In (A) to (C), the vertical bar indicates phaseolin fragmentation products.
Numbers at left indicate molecular mass markers in kilodaltons.



and 2; there was insufficient intact T343F to be detected, 
indicating that its proportion with respect to the T343F frag- 
ments was overestimated by immunoblot analysis). How- 
ever, when phaseolin was immunoprecipitated from leaf 
protoplast preparations, A418 was absent whereas the vac- 
uolar fragments of T343F were, as expected, still present 
(Figure 7B, lanes 3 and 4). This indicates that A418 accumu- 
lates in the apoplast. 

To study the trafficking of A418, we subjected protoplasts 
isolated from A418 transgenic leaves, or from T343F trans- 
genic leaves as a control, to pulse-chase analysis, followed 
by immunoprecipitation with the anti-phaseolin antiserum, 
SDS-PAGE, and fluorography (Figure 7C). During the chase,
A418 was secreted from the cells and accumulated in the 
extracellular medium; no fragmentation products were de- 
tectable. A very small amount of fragmentation products be- 
came visible in the intracellular A418 immunoprecipitates 
upon extremely long film exposures, indicating that a very 
small proportion of the mutated phaseolin can reach the 
vacuoles and that the protein can be fragmented. This pro- 
portion of mutated phaseolin, however, constitutes much 
less than 1% of total immunoprecipitable phaseolin (data 
not shown). 



The C Terminus of Phaseolin Is Exposed on the Surface 
of the Protein 

The fact that A418 is secreted could have two different
causes: (1) the last four amino acids are an essential part of
the vacuolar sorting signal for phaseolin, and their deletion
directly abolishes recognition by the sorting machinery; or
(2) removal of the four residues causes defects in the overall
structure of phaseolin, and secretion could then be due
to a wrong conformation of the phaseolin molecules, which
would no longer be recognized by the vacuolar sorting machinery
because of an altered surface exposure of a sorting
signal constituted by amino acids other than the deleted
ones. The three-dimensional structure of phaseolin has
been established in large part (Lawrence et al., 1990, 1994);
however, no information is available for the last 22 amino 
acids at the C terminus or for a few other segments of the 
protein. 

We devised experiments to assess the oligomerization 
state of A418 phaseolin and the exposure of the C terminus 
of T343F phaseolin, thereby attempting to distinguish be- 
tween the two possibilities mentioned above. We transiently 
transformed protoplasts with plasmids encoding T343F, 
A418, or A360 and subjected them to 1 hr of pulse-labeling. 
We then loaded the cell homogenates onto a 5 to 25% su- 
crose sedimentation velocity gradient. Phaseolin was then 
immunoprecipitated from each gradient fraction (Figure 8). 
The position of A418 along the gradient matched the posi- 
tion of the trimeric, transport-competent T343F exactly. In 
Figure 8, the assembly-defective mutant A360 (Pedrazzini et 
al., 1997) marked the position of phaseolin monomers along 

Figure 8. A418 Is Trimeric.

Tobacco protoplasts were transiently transfected with plasmids encoding
T343F, A418, or A360 and pulse-labeled for 1 hr with
35S-methionine and 35S-cysteine. Cell homogenates were subjected to
sedimentation velocity fractionation on a continuous 5 to 25% (w/v)
sucrose gradient. Gradient fractions were subjected to immunoprecipitation
with the anti-phaseolin antiserum and analyzed by SDS-PAGE
and fluorography. The top of the gradients is at left. Only the
portions of the gels containing phaseolin are shown.



the gradient. We thus conclude that A418 is competent for 
rapid and efficient trimerization and indistinguishable by velocity
centrifugation from normal T343F phaseolin. Therefore,
the absence of the four C-terminal residues does not
alter the oligomeric state of phaseolin. 

Secretion of T343F during transient expression is accompanied
by a late and slight decrease in molecular weight.
This can barely be seen when comparing lanes 9 and 10 of
Figure 2A or lanes 5 and 6 of Figure 5, but it becomes evident
when SDS-PAGE is performed for longer periods of
time, allowing more pronounced separation of the different
polypeptides, as shown in Figure 9, lanes 1 and 2. When
synthesis was allowed to occur in the presence of tunicamycin,
an inhibitor of N-glycosylation, the time-dependent decrease
in the molecular weight of secreted T343F was still
evident, indicating that it is not due to glycan trimming
(Figure 9, lanes 3 and 4). However, secreted A418 did not
undergo any decrease in molecular weight during the chase
(Figure 9, cf. lanes 5 and 6, and 7 and 8). This strongly suggests
that secreted T343F undergoes removal of a few
amino acids at the C terminus and that therefore the C-terminal
tetrapeptide required for vacuolar sorting is exposed
on the surface of the assembled protein. By using SDS-PAGE
analysis, we could not establish whether intracellular
T343F deposited into the vacuoles also undergoes this processing,
because of the major fragmentation event that accompanies
vacuolar deposition.

The results presented in Figures 8 and 9 suggest the ex- 
istence of a direct interaction between the C terminus of 
phaseolin and the vacuolar sorting mechanism. 



DISCUSSION 

We have shown here that vacuolar sorting of phaseolin in to- 
bacco leaf cells is very efficient in transgenic plants but can 



be saturated when high levels of phaseolin are expressed in 
transiently transformed protoplasts. Saturation leads to 
Golgi-mediated secretion of the excess of protein. We have
also shown that deletion of the last four C-terminal amino
acid residues abolishes sorting of phaseolin to the vacuole
and leads to its complete secretion. Therefore, vacuolar
sorting of phaseolin, a storage protein of the 7S (vicilin)
class, is most likely mediated by a saturable receptor sys- 
tem that recognizes a signal located at the C terminus of its 
ligand protein. 



Vacuolar Sorting of Phaseolin 

In the transgenic tobacco plants that we produced, T343F 
phaseolin was not secreted to detectable levels but almost 
saturated the vacuolar sorting mechanism: no additional 
vacuolar phaseolin was detectable upon transient overex- 
pression, and excess phaseolin was secreted. It is indeed 
possible that these transgenic plants have already slowly 
adapted to handle the excess of vacuolar protein that they 
are synthesizing. A further increase in the synthesis of 
phaseolin could require further adaptation that is not 
achieved during the short transient expression experiments. 
This would be a possible explanation for the otherwise fortu- 
nate coincidence between the level of phaseolin synthesis 
and the capacity of its sorting mechanism in transgenic to- 
bacco leaves. 

A saturable targeting mechanism is consistent with it be- 
ing receptor mediated. In yeast the receptor-mediated vac- 
uolar targeting of proteinase A and carboxypeptidase Y is 
saturable: when expressed at artificially high levels, these 
two proteins are secreted (Rothman et al., 1986; Stevens et
al., 1986). A putative receptor that recognizes some but not
all plant vacuolar targeting signals in vitro has been identi- 
fied in pea, but it is not known whether this receptor recog- 
nizes phaseolin (Kirsch et al., 1996; Paris et al., 1997). 

Figure 9. Secreted T343F, but Not A418, Undergoes Post-Translational
Proteolytic Processing.

Tobacco protoplasts were transiently transfected with plasmids encoding
T343F or A418 and subjected to pulse-chase in the presence
(+) of tunicamycin (Tm) or in the absence (-) of the inhibitor.
Pulse was for 1 hr with 35S-methionine and 35S-cysteine, and chase
was for 0 or 5 hr. The incubation media were then immunoprecipitated
with the anti-phaseolin antiserum and analyzed by SDS-PAGE
and fluorography. Note the increase in mobility of T343F, but not of
A418, between the pulse and the chase.

Transient expression in tobacco protoplasts has been
used to study vacuolar sorting of other plant proteins. Targeting
of the wild-type forms of barley lectin (Bednarek et
al., 1990) and Brazil nut 2S albumin (Saalbach et al., 1996)
was not saturated, but recombinant vacuolar chitinase of to- 
bacco showed partial secretion that was directly propor- 
tional to its level of expression (Neuhaus et al., 1994). 
Because more than one targeting mechanism exists for 
plant vacuoles (Matsuoka et al., 1995), it is possible that not 
all of them are saturated at similar levels of protein synthe- 
sis; the different plasmids used for transfection may also 
lead to different expression levels, but saturability is clearly 
not unique to phaseolin. 

The saturation of the vacuolar targeting of phaseolin suggests
either that a component of the sorting machinery is
not sufficiently abundant to handle the number of phaseolin 
molecules synthesized or that above a certain concentration, 
phaseolin undergoes a change in conformation that inhibits 
recognition by the sorting machinery. Electron microscopy 
shows that in tobacco leaf vacuoles, as in bean storage vac- 
uoles, phaseolin forms protein bodies, probably because of 
the acidic environment. Currently, however, we have no evi- 
dence that upon transient expression, similar structures or 
aggregates of phaseolin are formed during its transport be- 
fore being deposited into the vacuole: we were unable to 
detect any phaseolin aggregates by velocity gradient cen- 
trifugation, and the proteolytic trimming of secreted T343F 
phaseolin strongly suggests that the sequence necessary 
for its vacuolar sorting remains available for interactions with 
the sorting machinery. 

Further support of this hypothesis comes from digestion 
of secreted T343F phaseolin with endoglycosidase H, which 
allowed us to determine that in plant cells, secretion of an 
overexpressed vacuolar protein occurs via the Golgi-mediated
pathway and not via an alternative route. The fact that the
vast majority of secreted T343F phaseolin molecules are re- 
sistant to endoglycosidase H digestion also indicates that 
their glycan is accessible to Golgi-processing enzymes: this 
confirms that at the time of passage through at least some of 
the Golgi stacks, most of the phaseolin that will be secreted 
is not aggregated. Therefore, if secretion is due to aggregation,
this is a late and transient event during the trafficking of
phaseolin, and the aggregates are easily disrupted during
analysis. Therefore, we suggest that secretion is instead due
to insufficient abundance of the sorting machinery. 

The fact that the glycan of T343F acquires a complex 
structure both in secreted phaseolin and in phaseolin de- 
posited into the vacuole has implications for the definition of 
the routes that lead to the two different locations as well. 
Analogous to the targeting to the lytic compartments of other 
eukaryotes, it is assumed that sorting of vacuolar proteins 
from secretion occurs in the trans-Golgi network, but this
has not been proven directly. Because in plant cells N-linked 
glycans acquire complex structures in the medial and trans- 
Golgi complex (Fitchette-Lainé et al., 1994), our data
indicate that the two routes do not fork before at least the cis-



and medial Golgi complex stacks have been visited. This con- 
firms the previous observation that complex glycans are 
present both on secreted and on vacuolar plant proteins (Faye 
et al., 1989; Vitale and Bollini, 1995) and extends it to the
synthesis of an individual protein in cells from a single tissue. 



The Sorting Signal 

The sequence Ala-Phe-Val-Tyr is uncharged and mainly hy- 
drophobic. Phaseolin is the product of a small gene family, 
and the tetrapeptide is conserved in both α and β polypeptides
(Slightom et al., 1985). Short propeptides at the C terminus
are known to determine the vacuolar sorting of
tobacco chitinase (seven amino acids; Neuhaus et al., 1991) 
and Brazil nut 2S albumin (four amino acids; Saalbach et al., 
1996). A longer C-terminal propeptide (15 amino acids) contains
the vacuolar sorting signal of barley lectin (Bednarek et
al., 1990). A vacuolar sorting consensus sequence cannot 
be derived, but these propeptides are also rich in hydropho- 
bic amino acids. Mutational analysis revealed the impor- 
tance of hydrophobic residues in barley lectin (Dombrowski 
et al., 1993) but not in chitinase (Neuhaus et al., 1994). It 
should also be considered that in the native plants, the 7S 
and 2S storage proteins and barley lectin accumulate in 
storage vacuoles, whereas chitinase is located in vegetative 
vacuoles. Transport from the Golgi complex to the two 
types of vacuoles involves different structures (Chrispeels, 
1983; Hohl et al., 1996). Eventually, in mature mesophyll 
cells, the two types of vacuoles probably merge (Schroeder 
et al., 1993; Paris et al., 1996), but it is possible that the 
mechanisms recognizing the C-terminal vacuolar sorting 
signals are more than one and are distinct for vegetative and 
storage proteins. 

When T343F is secreted during transient expression, the 
C terminus is most probably trimmed by hydrolases that 
also are secreted by the protoplasts, indicating that it is ex- 
posed on the surface of phaseolin trimers. Whether this 
trimming also occurs upon delivery to the vacuole cannot be 
established by SDS-PAGE analysis, because vacuolar 
phaseolin is fragmented in tobacco cells. However, phaseo- 
lin accumulated in the storage vacuoles of bean cotyledon- 
ary cells lacks a few amino acids with respect to the 
ER-associated precursor (Bollini et al., 1982; D'Amico et al.,
1992). The N-terminal sequence of phaseolin purified from
bean cotyledons is the one predicted after signal peptide re- 
moval, indicating that trimming must be at the C terminus 
(Paaren et al., 1987). Therefore, the Ala-Phe-Val-Tyr se- 
quence probably constitutes or is part of a short transient 
peptide and is most likely exposed on the surface of 
phaseolin trimers. This makes the sequence a good candi- 
date for direct interactions with the vacuolar sorting machin- 
ery and does not favor the alternative hypothesis that 
deletion of the sequence affects the overall conformation of 
phaseolin and inhibits recognition by the sorting machinery 
only as an indirect effect. 

Secretion is a safe haven for phaseolin: indeed, A418 accumulates
in its intact form. When wild-type phaseolin is expressed
in transgenic plants, the default targeting to the
vacuole leads to its proteolytic cleavage in all tissues tested, 
with the only exception being rice endosperm (Murai et al., 
1983; Sengupta-Gopalan et al., 1985; Bagga et al., 1992; 
Zheng et al., 1995; Pedrazzini et al., 1997). Comparisons of
the amounts of T343F and A418 accumulated in leaves are 
beyond the scope of this work, but our data suggest that se- 
cretion could be a very effective means to preserve phaseo- 
lin from degradation. This could be considered as a more 
general, useful strategy when planning the expression of 
heterologous proteins of biotechnological interest in the 
plant secretory pathway. A strategy that has been success- 
ful in enhancing the stability of foreign vacuolar proteins is 
their accumulation in the ER via the KDEL/HDEL system 
normally used by soluble ER residents (Wandelt et al., 1992; 
Tabe et al., 1995; Khan et al., 1996). It would be interesting
to compare accumulation in the ER versus the apoplast by 
using the same passenger proteins. 

Vacuolar Sorting and ER Quality Control 

The fact that Δ418 phaseolin is not sorted to the vacuole but
is secreted means that the assembly-defective phaseolin
constructs Δ363 and Δ360 (Pedrazzini et al., 1997), which
are the results of more extensive C-terminal deletions, also
lack the vacuolar targeting signal of trimeric phaseolin.
In contrast, HiMet phaseolin, a trimeric but unstable
mutated phaseolin that is transported to the vacuole
and rapidly degraded there, still bears the sequence Ala- 
Phe-Val-Tyr (Hoffman et al., 1988; Pueyo et al., 1995). As- 
sembly-defective phaseolin is subjected to ER quality 
control: it is extensively retained in the ER, where it shows 
prolonged interactions with the chaperone BiP and then is 
slowly degraded (Pedrazzini et al., 1997). The location where
assembly-defective phaseolin is degraded is not known. 
Possible candidates are a subcompartment of the ER or the 
cytosol after dislocation from the ER; another candidate is 
the vacuole, which would be reached by a Golgi- 
independent pathway (Pedrazzini et al., 1997). The results 
we have presented here indicate that in whichever compart- 
ment degradation occurs, targeting of assembly-defective 
phaseolin to degradation by quality control does not depend 
on the signal that normally sorts phaseolin to the hydrolytic 
compartments of the plant cell. 



METHODS 



Recombinant DNA 

The strategies to construct T343F and Δ360 phaseolin have been described
previously (Pedrazzini et al., 1997). To construct Δ418, the
antisense oligonucleotide 5'-CTGCAGTCAACCCTTTCTTCCCTTTTGC-3'
was used in a polymerase chain reaction to introduce a stop
codon (underlined) after residue 417 in the T343F phaseolin coding
sequence. The construct thus lacks the last four residues of wild-type
phaseolin (Ala, Phe, Val, and Tyr). Δ418 was introduced into the
vector pDHA (Tabe et al., 1995) for transient expression experiments.
For constitutive expression in transgenic tobacco, pDHA containing
Δ418 was inserted into the HindIII site of the binary vector pGA470
(An et al., 1985), which was then used to transform Agrobacterium 
tumefaciens LBA4404 by electroporation. 
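
As an illustrative check of the primer design described above, the antisense oligonucleotide can be reverse-complemented to read the sense strand and to locate the stop codon it introduces. This is a minimal sketch, not part of the original methods: the function names are hypothetical, and the primer is the sequence printed in the text with typesetting spaces removed.

```python
# Antisense primer as printed in the text (spaces and line break removed).
ANTISENSE = "CTGCAGTCAACCCTTTCTTCCCTTTTGC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def stop_codon_positions(seq, frame):
    """Return 0-based start indices of stop codons read in the given frame."""
    return [
        i
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


# Sense strand corresponding to the antisense primer.
sense = reverse_complement(ANTISENSE)

# Scan all three forward frames; which frame matches the phaseolin
# coding sequence is not stated here, so all are reported.
stops_by_frame = {frame: stop_codon_positions(sense, frame) for frame in range(3)}

if __name__ == "__main__":
    print(sense)
    print(stops_by_frame)
```

Only one of the three frames of the sense strand contains a stop codon (TGA), which is consistent with a single engineered termination site; confirming that this frame is the phaseolin reading frame would require the full coding sequence.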



Production of Transgenic Tobacco Plants and Transient
Transformation of Leaf Protoplasts 

The transformed Agrobacterium was used to produce transgenic tobacco
(Nicotiana tabacum cv Petit Havana SR1) plants, as described
by Pedrazzini et al. (1997). 

Protoplasts were prepared from axenic leaves (4 to 7 cm long) of 
untransformed tobacco or from transgenic tobacco expressing T343F
or Δ418 phaseolin. Protoplasts were subjected to polyethylene glycol-mediated
transfection, as described previously (Pedrazzini et al.,
1997). Vector pDHA without inserts was used as a negative control
for transfection. Unless otherwise stated, 40 μg of plasmid was used
to transform 10^6 protoplasts in 1 mL. After transfection, protoplasts
were allowed to recover overnight in the dark at 25°C in K3 medium
(Pedrazzini et al., 1997), at a concentration of 10^6 cells/mL, before
performing pulse-chase experiments. 



Pulse-Chase Labeling of Protoplasts and Analysis of Phaseolin 

Pulse-chase labeling of protoplasts using Pro-Mix (a mixture of
35S-methionine and 35S-cysteine; Amersham) was performed as described
previously (Pedrazzini et al., 1997). In some experiments, before
radioactive labeling, protoplasts were incubated for 45 min at
25°C in K3 medium supplemented with 50 μg/mL tunicamycin (stock
solution 5 mg/mL in 10 mM NaOH, stored at 4°C; Boehringer Mannheim)
and maintained in the presence of the inhibitor for the entire
experiment. At the desired pulse-chase time points, 3 volumes of W5
medium (154 mM NaCl, 5 mM KCl, 125 mM CaCl2·2H2O, and 5 mM
glucose) was added, and the samples were centrifuged for 5 min at
50g. The supernatant, containing the proteins secreted into the incubation
medium, was removed, leaving 100 μL to cover the protoplasts.
The supernatant and protoplasts were frozen separately in
liquid nitrogen and stored at -80°C. Homogenization of the frozen
samples and immunoprecipitation of phaseolin using rabbit antiserum
raised against phaseolin purified from bean cotyledons were
performed as described previously (Pedrazzini et al., 1997). Unless
otherwise stated, for each pulse-chase time point immunoprecipitation
was performed using cell homogenate and incubation medium
from the same amount of protoplasts. Digestion of immunoprecipitated
proteins with endoglycosidase H (recombinant; Boehringer Mannheim)
was performed as described previously (Ceriotti et al., 1991).
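
The solution arithmetic implied by this paragraph can be sketched as follows: converting the stated W5 composition from millimolar to grams per litre, and working out how much tunicamycin stock gives the working concentration. This is an illustrative sketch only. The molar masses are standard rounded values supplied here as assumptions (they are not given in the text), the tunicamycin working concentration is read as 50 μg/mL, and the function names are hypothetical.

```python
# Approximate molar masses in g/mol. These are standard rounded values
# assumed for illustration; they do not appear in the original text.
MOLAR_MASS = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "CaCl2.2H2O": 147.01,
    "glucose": 180.16,
}

# W5 medium composition as stated in the text, in mM.
W5_MM = {"NaCl": 154, "KCl": 5, "CaCl2.2H2O": 125, "glucose": 5}


def grams_per_litre(conc_mm, molar_mass):
    """Convert a millimolar concentration to grams of solute per litre."""
    return conc_mm / 1000.0 * molar_mass


def stock_volume_ul(final_ug_per_ml, stock_mg_per_ml, final_volume_ml):
    """Volume of stock (in microlitres) needed, from C1*V1 = C2*V2."""
    stock_ug_per_ml = stock_mg_per_ml * 1000.0
    volume_ml = final_ug_per_ml * final_volume_ml / stock_ug_per_ml
    return volume_ml * 1000.0


# Grams per litre for each W5 component, rounded to two decimals.
w5_recipe = {
    name: round(grams_per_litre(conc, MOLAR_MASS[name]), 2)
    for name, conc in W5_MM.items()
}

# Microlitres of 5 mg/mL tunicamycin stock per mL of medium for 50 ug/mL.
tunicamycin_ul_per_ml = stock_volume_ul(50, 5, 1.0)

if __name__ == "__main__":
    print(w5_recipe)
    print(tunicamycin_ul_per_ml)
```

Under these assumptions the tunicamycin stock is diluted 1:100, that is, about 10 μL of stock per millilitre of K3 medium.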

For the analysis of phaseolin assembly by sedimentation velocity,
after radioactive labeling and homogenization, homogenates were
brought to 8 mM MgCl2 and 3 mM ATP and loaded on top of a continuous
5 to 25% (w/v) linear sucrose gradient made in 150 mM NaCl,
1 mM EDTA, 0.1% Triton X-100, and 50 mM Tris-Cl, pH 7.5. Samples
were centrifuged at 39,000 rpm in an SW40 Ti rotor (Beckman Instruments,
Inc., Fullerton, CA) for 20 hr at 20°C. Phaseolin was then immunoselected
from each gradient fraction.

SDS-PAGE analysis was always performed on 15% acrylamide
gels. Rainbow 14C-methylated proteins (Amersham) were used as
molecular weight markers. Gels were treated with 2,5-diphenyloxazole
dissolved in dimethyl sulfoxide, and radioactive polypeptides were revealed
by fluorography. Quantification of the relative intensities of bands in
the fluorographs was performed by microdensitometry, using a Camag
TLC Scanner II (Camag, Muttenz, Switzerland). Care was taken
to use film exposures that were in the linear range of film darkening.
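
The sedimentation velocity run above is specified as a rotor speed (39,000 rpm) rather than a centrifugal force. A minimal sketch of the usual conversion, RCF = 1.118 × 10^-5 × r(cm) × rpm^2, is given below for readers who need to reproduce the run in a different rotor. The rotor radii are assumed nominal values for an SW40 Ti-type swinging-bucket rotor and are not taken from the text; they should be checked against the rotor manual.

```python
def rcf(rpm, radius_mm):
    """Relative centrifugal force (multiples of g) at a given radius."""
    radius_cm = radius_mm / 10.0
    return 1.118e-5 * radius_cm * rpm ** 2


# Assumed nominal radii in mm for an SW40 Ti-type rotor (hypothetical
# values for illustration; verify against the manufacturer's manual).
SW40_RADII_MM = {"r_min": 66.7, "r_av": 112.7, "r_max": 158.8}

# Speed stated in the text.
RUN_RPM = 39000

# Force at the top, middle, and bottom of the tube for this run.
rcf_at_run_speed = {
    position: rcf(RUN_RPM, radius)
    for position, radius in SW40_RADII_MM.items()
}

if __name__ == "__main__":
    for position, value in rcf_at_run_speed.items():
        print(position, round(value))
```

Because force varies roughly 2.4-fold between the top and bottom of the tube under these assumed radii, matching the average radius is usually the more meaningful target when adapting a gradient run to another rotor.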



Total Leaf Protein Extraction and Protein Gel Blot Analysis 

For analysis of total proteins, leaves from transformed plants were 
homogenized in an ice-cold mortar with 0.2 M NaCl, 1 mM EDTA, 2%
β-mercaptoethanol, 0.2% Triton X-100, and 0.1 M Tris-Cl, pH 7.8,
supplemented with Complete (Boehringer Mannheim) protease inhibitor
cocktail. The buffer/tissue ratio was 6:1 (milliliters per gram
fresh weight of tissue). The homogenate was centrifuged at 1500g at
4°C for 10 min, and the supernatant was separated by SDS-PAGE. 
Immunodetection was performed using the enhanced chemilumines- 
cence (ECL) system (Amersham), following the manufacturer's proto- 
col. Proteins were detected using the anti-phaseolin antiserum. 

To determine whether Δ418 phaseolin is located intracellularly or
extracellularly, we immunoselected protoplasts or leaf extracts from
transgenic plants with the anti-phaseolin antiserum, and the selected 
material was analyzed by SDS-PAGE followed by silver staining. 



Immunomicroscopy 

Leaf fragments (1 to 2 mm) were fixed in 0.8% glutaraldehyde and 
3.3% paraformaldehyde in 0.1 M phosphate buffer, pH 7.4, at 4°C for 
2 hr and in 1% OsO4 in the same buffer for 2 hr. They were then dehydrated
in an ethanol series and embedded in Spurr's resin (Spurr,
1969). Thin sections were etched with 1% aqueous periodic acid and
incubated overnight at 4°C in the presence of either the anti-phaseolin
antiserum, diluted 1:1000 with PBS-BSA-C (Aurion, Wageningen,
The Netherlands), or preimmune serum as control. After washing, the
samples were incubated for 1 hr at room temperature in goat anti-rabbit
15-nm gold complex (1:20; Biocell, Cardiff, UK) and stained
with 2% uranyl acetate and lead citrate.

ABSTRACT Plant cell vacuoles may have either storage or
degradative functions. Vegetative storage proteins (VSPs) are
synthesized in response to wounding and to developmental
switches that affect carbon and nitrogen sinks. Here we show
that VSPs are stored in a unique type of vacuole that is derived
from degradative central vacuoles coincident with insertion of
a new tonoplast intrinsic protein (TIP), δ-TIP, into their
membranes. This finding demonstrates a tight coupling between
the presence of δ-TIP and acquisition of a specialized
storage function and indicates that TIP isoforms may determine
vacuole identity.



Uniquely, in contrast to yeast or mammalian cells, plant cells 
may contain separate vacuoles with protein storage and di- 
gestive functions (1, 2). It is not known how functionally 
distinct vacuoles are generated or maintained. Plant vacuole 
tonoplast membranes contain abundant tonoplast intrinsic 
proteins (TIPs) that may function as aquaporins (3), but the 
quantities present seem to be far in excess of what is required 
for water transport (4). Protein storage vacuoles (PSVs), 
containing seed-type storage proteins, are marked by the
presence of α-TIP, and lytic or degradative vacuoles (LVs) are
marked by the presence of γ-TIP (1, 2). These observations
indicate that a specific TIP isoform correlates with a specific 
vacuole function (5). A further test of this hypothesis would 
require demonstration that a third functionally distinct vacuole 
carried a different TIP isoform. Here we define a third 
functionally distinct vacuole in plant cells and demonstrate that 
it is marked specifically by δ-TIP.

MATERIALS AND METHODS 

Antibodies and Immunocytochemistry. Anti-α-TIP protein
antiserum (6) and VM23 antiserum to purified γ-TIP protein
from radish root (7) were generously provided by M.
Chrispeels (6) and M. Maeshima (7), respectively. Synthetic
peptides were synthesized and antisera to the peptides coupled
to keyhole limpet hemocyanin were generated by Quality
Controlled Biochemicals, Hopkinton, MA. For antibody purification,
the peptides were coupled via an amino-terminal
Cys residue to SulfoLink agarose (Pierce) according to the
manufacturer's instructions, and peptide-specific antibodies
were affinity-purified as described previously (8) for use in all
procedures. Antisera to proteinase inhibitor I (Inh I) and II
(Inh II) have been described previously (9-11). Fluorochrome-tagged
secondary antibodies were purchased from Jackson
ImmunoResearch. Plant tissues were fixed in either formal- 
dehyde/acetic acid/ethyl alcohol or 3.7% paraformaldehyde, 
and paraffin-embedded sections were prepared as described 
(12). After removal of paraffin and rehydration, the sections 
were blocked as described (2). The double-label protocol to
identify two different rabbit antibodies separately with immunofluorescence
has been described (2). Briefly, the first primary
antibody was completely covered by incubation with an
excess of anti-rabbit F(ab')2 fragment coupled to lissamine
rhodamine before incubating with the second primary antibody
and detection with a fluorescein isothiocyanate (FITC)-coupled
secondary antibody. Slides were viewed and photographed
under epifluorescence with an Olympus BX-50 microscope
containing a multifilter set (no. 61002, Chroma
Technology, Brattleboro, VT) that permits simultaneous viewing
of blue (462 nm), green (531 nm), and red (627 nm)
emissions that are excited at 402 nm, 496 nm, and 571 nm, 
respectively. This filter set converts most background fluores- 
cence from plant tissues, including that from chlorophyll, to 
gray, and thus facilitates identification of fluorescence result- 
ing from specific antibodies. Photographic prints were scanned 
into a computer, and images representing individual red 
(lissamine rhodamine) and green (FITC) emissions were generated
by using Adobe Photoshop software (Adobe Systems,
Mountain View, CA). 

Plant Material. Growth of soybean plants and depodding to 
induce vegetative storage protein (VSP) accumulation have
been described (13-15). Tomato, petunia, and tobacco plants 
were grown in Washington State University greenhouses, and 
other plant materials were purchased from a local supermar- 
ket. Tissue extracts for Western blot analyses (16) were 
prepared by homogenization at 25°C in 0.0625 M Tris-HCl, pH
6.8/2% SDS with a Polytron-type homogenizer (PRO250; 
PRO Scientific, Monroe, CT). After homogenization the 
extracts were allowed to sit for 1 hr at 25°C and then clarified 
by centrifugation at 12,000 × g.

RESULTS 

Previous studies to identify specific TIP isoforms in vacuoles 
used antiserum to α-TIP, a single protein purified from bean
seeds (6), and VM23 antiserum recognizing a γ-TIP
protein purified from radish root (7). We noted that the
carboxyl-terminal, cytoplasmic-tail sequences of the different
TIPs varied such that it was possible to make antibodies
specific for each: γ-TIP, THEQLPTTDY (barley, GenBank
accession no. X80266; rice, D25534; Arabidopsis, Z34662 and
X72581; radish, D84669); α-TIP, HQPLATEDY (Phaseolus,
X62873) and HQPLAPEDY (Arabidopsis, X63551); DIP (17),
HTPLPTSEDYA (Antirrhinum, X70417) and HAPLPTSEDYA
(potato, U65700; tobacco, X54855); and δ-TIP, HVPLASADF
(Arabidopsis, U39485). Accordingly, rabbits were
immunized with the following peptides coupled to keyhole
limpet hemocyanin: γ-TIP, CSRTHEQLPTTDY; α-TIP,
C(aminohexanoic acid)HQPLAPEDY; DIP, CHTPLPTSEDYA;
and δ-TIP, CHVPLASADF, and affinity-purified
antibodies were tested on dot-blots containing the different 
peptides coupled to BSA. Each antibody preparation was 
specific, having at least a 100-fold-higher affinity for its 
corresponding peptide than for any of the other peptides (data 
not presented). Use of the anti-DIP peptide antibodies will be 
described elsewhere. 
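
The premise that the C-terminal tails differ enough to raise isoform-specific antibodies can be illustrated with a rough pairwise comparison of the four peptide sequences listed above. This is only a sketch: it uses a generic string-similarity ratio from the Python standard library as a stand-in for sequence divergence, it says nothing about actual antibody affinity, and the dictionary keys and function name are hypothetical labels.

```python
from difflib import SequenceMatcher
from itertools import combinations

# C-terminal tail sequences as listed in the text (one per TIP class).
C_TERMINAL_TAILS = {
    "gamma-TIP": "THEQLPTTDY",
    "alpha-TIP": "HQPLAPEDY",
    "DIP": "HTPLPTSEDYA",
    "delta-TIP": "HVPLASADF",
}


def similarity(seq_a, seq_b):
    """Similarity ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, seq_a, seq_b).ratio()


# Similarity for every unordered pair of tails, rounded for display.
pairwise = {
    (name_a, name_b): round(
        similarity(C_TERMINAL_TAILS[name_a], C_TERMINAL_TAILS[name_b]), 2
    )
    for name_a, name_b in combinations(C_TERMINAL_TAILS, 2)
}

if __name__ == "__main__":
    for pair, score in sorted(pairwise.items(), key=lambda item: item[1]):
        print(pair, score)
```

A proper comparison would use an alignment with a substitution matrix; the ratio here is chosen only because it needs no external library and handles peptides of unequal length.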

Anti-TIP Peptide Antibody Specificity. The specificity of the 
antibodies is demonstrated by the following results from 
Western blot analyses of different plant tissue extracts (Fig. 1). 
The anti-α-TIP protein antiserum (Fig. 1A) identified a band
of the appropriate 26-kDa size in only the pea root tip extract
(lane 1), and a similar pattern was obtained with our anti-α-TIP
peptide antibodies (Fig. 1B). The VM23 anti-γ-TIP protein
antibodies identified appropriate bands in both pea root
tip and radish root (lane 2) (Fig. 1D), a pattern that was
indistinguishable from that obtained with our anti-γ-TIP peptide
antibodies (Fig. 1E). Therefore, results with the anti-peptide
antibodies were indistinguishable from results in which
antibodies that detect the entire α- and γ-TIP proteins were
used. In immunofluorescence studies with pea and barley root
tip cells (2), the anti-α- and anti-γ-TIP antibodies label separate
α-TIP- and γ-TIP-containing vacuoles, respectively
(G.-Y.J. and J.C.R., unpublished data); therefore, the ≈26-kDa
band detected by the anti-γ-TIP antibodies in pea root tip
extracts (Fig. 1 D and E, lane 1) is γ-TIP and does not represent
cross-reactivity with α-TIP. [The ≈40-kDa bands represent
TIP dimers that form when tissue extracts are heated during
preparation of samples for electrophoresis (6, 7). Additionally,
the variation in size of the ≈26-kDa TIP monomers is a result
of proteolytic cleavage of the first transmembrane domain
from the rest of the protein in certain plant tissues (18).] In
contrast, the δ-TIP anti-peptide antibodies (Fig. 1C) identified
specific TIP bands in pea root tips, radish root, flower petals
from petunia (lane 3) and tobacco (lane 4), and potato tubers
(lane 5). Neither α-TIP nor γ-TIP was detected in the extracts
from flower petals and potato tuber. These findings led us to
ask which functions might be associated with δ-TIP in potato
tuber or flower petal vacuoles. 

Many plant species accumulate VSPs (19, 20). VSPs have 
physical characteristics that differ substantially from seed-type 
storage proteins present in PSVs and include proteins with 
diverse functions such as protease inhibitors, acid phospha- 
tases, and lipoxygenases. VSPs may accumulate during vege- 
tative growth, serving as temporary sites of carbon and nitro- 
gen storage, and are mobilized as an energy source during seed 
development (19). In other circumstances, VSPs such as
protease inhibitors accumulate in response to pathogen attack
or wounding (9). Protease inhibitors are abundant in potato 
tubers, but initially are stored in leaves in plantlets until 
rhizomes form; then the inhibitors disappear from leaves and 
begin to accumulate in the developing tubers (21).

Immunofluorescence Localization. We used immunofluorescence
on paraffin-embedded tissue sections to assess localization
of VSPs and their association with δ-TIP in several
different tissues of potato, tomato, petunia, and soybean plants
by using specific antibodies. In potato tuber (Fig. 2), strong
labeling with the anti-δ-TIP antibodies (red) was present on
central vacuole tonoplast in most cells (open arrow), and the
same vacuoles were strongly labeled with antibodies to potato
protease inhibitor I (21) (Inh I, green). Occasional cells had
two distinct vacuoles: in addition to large vacuoles containing
δ-TIP and Inh I (asterisk) there were smaller vacuoles (solid
arrow) that did not label with either antibody. Additionally, in
these cells, both δ-TIP and Inh I appeared to be present in
numerous small organelles within the cytoplasm. 

Proteinase inhibitors in tomato leaves are induced by 
wounding or by methyl jasmonate treatment (10). We used 
antibodies to tomato proteinase inhibitor II (Inh II) (11) to 
compare its localization to that of δ-TIP in sections of leaves
from plants treated with methyl jasmonate for 2 days to induce
accumulation of Inh II (Fig. 3 a and b). On Western blots,
tomato leaf extracts have no detectable α- or γ-TIP, but have
abundant δ-TIP (data not presented). Both Inh II (Fig. 3,
green) and δ-TIP (red) antibodies strongly labeled epidermal
cells (Fig. 3a), bundle sheath cells (not shown), and cells (Fig.
3b) with a tissue distribution typical of paraveinal mesophyll
(PVM) as characterized in legumes. The two antibodies colocalized
in small vacuoles (solid arrows) that were separate from
the central vacuole in the cells; neither antigen appeared to be
associated with the central vacuole. Controls used in all
experiments support the specificity of patterns shown: (i)
single labeling with each of the antibodies gave the same cell
and organelle staining pattern (data not presented); and (ii)
when sections labeled with the δ-TIP antibodies were incubated
with the excess of rhodamine-conjugated anti-rabbit IgG
F(ab')2 secondary antibodies, washed, and then incubated with
the FITC-conjugated anti-rabbit IgG secondary antibodies, no
green labeling was obtained (data not presented). This demonstrates
that the F(ab')2 antibodies completely blocked all of
the δ-TIP antibody sites so that FITC labeling in the sections
shown in Fig. 3 must be a result of the presence of anti- 
proteinase inhibitor II antibodies at those sites. 




Fig. 1. Western blot analyses of plant tissue extracts with different anti-TIP antibodies. Membranes carrying 300 μg of protein from extracts
of pea root tips (1), radish root (2), petunia flower petals (3), tobacco flower petals (4), and potato tuber (5) were tested with antibodies to α-TIP
protein (A) (8) (1:1,000), α-TIP C-terminal peptide (B) (0.2 μg/ml), δ-TIP peptide (C) (0.95 μg/ml), γ-TIP protein (D) [VM23 antiserum (7),
1:1,000], and γ-TIP peptide (E) (0.24 μg/ml). M, molecular mass markers; sizes (in kDa) are indicated to the left.

Fig. 2. Immunofluorescence on potato tuber. A differential interference
contrast (DIC) image of the potato tuber section is shown
at the upper right. The double-label image (Merge) is in the upper left,
with separated anti-δ-TIP (used at 19 μg/ml) (red) and anti-protease
inhibitor I (Inh I) (1:100) (green) images at the bottom. Open arrow,
central vacuole tonoplast staining with anti-δ-TIP, where Inh I is
closely adherent; solid arrow, vacuole with tonoplast not staining with
anti-δ-TIP and not containing Inh I in a cell with a larger vacuole
(asterisk) labeled with the two antibodies. (Bar = 50 μm.)

We found that tomato flower petals contain high concentrations
of protease inhibitors I and II, 94 and 524 μg/ml of
tissue extract, respectively. When sections of tomato flower
petals were probed with antibodies against δ-TIP and Inh II,
anti-δ-TIP (Fig. 3c, red) strongly labeled the tonoplast of the
central vacuole (solid arrow) in all epidermal cells; double-labeling
demonstrated that Inh II (Fig. 3c, green) was present
in the contents of these vacuoles (open arrow). Anti-δ-TIP
antibodies also labeled the central vacuole tonoplast of epidermal
cells in petunia flower petals (Fig. 4a). As shown in Fig.
4b, pigment is present in the central vacuole of the upper
epidermal cells. 

To test further the apparent association of δ-TIP and
vacuoles containing VSPs, we utilized a soybean system in
which deposition of VSPs in vacuoles of epidermal, bundle
sheath, and PVM cells is induced by continuous depodding to
remove the major nitrogen sink (13, 19, 22). Control plants
were grown under identical conditions but were allowed to
develop pods (13). Western blots demonstrated γ- and δ-TIP
in leaf extracts of both control and depodded plants (data not
presented). Double-label immunofluorescence of leaf sections
with the two antibodies demonstrated that γ-TIP (Fig. 5a, red)
labeled central vacuole tonoplast of epidermal (arrow), bundle
sheath, and PVM (white dot) cells with no difference between
control (Fig. 5a Left) and depodded (Fig. 5a Right) plants. In
contrast, the pattern of δ-TIP labeling changed substantially in
leaves from control versus depodded plants. In control (green,
Fig. 5a Left), little labeling of epidermal (arrow) or PVM 




Fig. 3. Immunofluorescence on tomato tissues. Anti-δ-TIP (δ) is
in red and anti-Inh II (used at 1:100) is in green. M, double-label
image. (a) Epidermal cell from leaf section. Solid arrow indicates small
vacuole stained with both antibodies; open arrow indicates central
vacuole. (b) PVM cell. Solid arrow as for a; v, central vacuole. (c)
Epidermal cell from bottom surface of flower petal. Solid arrow
indicates central vacuole tonoplast stained with anti-δ-TIP; open arrow
indicates aggregated Inh II protein within lumen of central vacuole.
(Bars = 25 μm.)

(white dot) was observed, while prominent labeling of those 
cells was observed in depodded plants (Fig. 5a Right). This is
a specific difference because the maximum intensity of labeling
of cells in the palisade layer (P) was similar in both control and
depodded samples. The reproducibility of these observations
was tested in a blinded manner by providing unlabeled slides
with anti-γ- and anti-δ-TIP double-labeled sections to three
independent observers; each observer accurately identified
control and depodded samples. In Fig. 5b, the difference in
δ-TIP abundance is compared in sections labeled with only that
antibody. Spongy mesophyll cells stained with similar intensity 
in leaves from both controls and depodded plants. Clearly,

Fig. 4. Immunofluorescence on petunia petal. (a) DIC (Left) and
anti-δ-TIP labeling (Right). Upper and lower epidermal cells are shown
at top and bottom, respectively. (b) Living cell from upper epidermis
of petal. DIC image (Left) and endogenous fluorescence from vacuole
contents (Right). Solid arrow, limit of vacuole; open arrow, cell wall;
n = nucleus. (Bars = 25 μm.)

however, there was abundant staining of the tonoplast in PVM
(dot) and epidermal (asterisk) cells in leaves from depodded plants
while very little staining of these cells was evident in the
control plants. When similar sections were double-labeled with
anti-δ-TIP antibodies and antibodies to soybean VSPs, the
VSPs were present in PVM and epidermal cells (13, 15) whose
vacuoles now carried δ-TIP in their tonoplast (data not presented).
Thus, induction of VSP storage in soybean leaf cell
vacuoles was accompanied by induction of δ-TIP expression in
tonoplast of the same vacuoles while expression of γ-TIP was
unchanged. 



DISCUSSION 

These data demonstrate that vacuoles whose tonoplast is 
labeled with anti-δ-TIP antibodies are used by plant cells to
store pigments and VSPs, proteins synthesized in response to
developmental and environmental cues. Central vacuoles in
petunia and tomato flower petal epidermal cells contain
pigments and protease inhibitors. Tonoplast in most central
vacuoles in potato tubers labeled strongly with anti-δ-TIP
antibodies, and only those vacuoles contained Inh I. Vacuoles
in tomato leaf epidermal, bundle sheath, and PVM cells
labeled with anti-δ-TIP antibodies and contained Inh II; these
vacuoles were smaller and separate from central vacuoles in
the same cells. In soybean leaves, central vacuoles marked by
γ-TIP in their tonoplast acquired abundant δ-TIP in the same
tonoplast when storage of VSPs in the vacuoles was induced by
depodding. This finding demonstrates a tight coupling between
the presence of δ-TIP and acquisition of a specialized vacuolar
storage function by vacuoles that otherwise would have been
assumed to have a lytic function given the presence of γ-TIP
in their membranes. These results in aggregate define "delta
vacuoles" (ΔVs) with δ-TIP in their tonoplast as specialized
storage organelles that are structurally and functionally dis- 
tinct from PSVs and LVs. 



Fig. 5. Immunofluorescence on soybean leaf. (a) Sections from
leaves harvested at 4 weeks from control (Left) and continuously
depodded (Right) plants are shown. Indicated are the palisade cell (P)
and spongy mesophyll cell (S) layers; the white dot indicates a PVM
cell, while the arrow indicates an epidermal cell on the lower surface
of the leaf. Anti-δ-TIP is in green, and anti-γ-TIP (used at 8.8 μg/ml)
is in red, with the double-label (Merge) image at the top. (Bar = 50
μm.) (b) Sections from leaves of control and depodded plants as in a.
(Upper) DIC image. (Lower) Single label with anti-δ-TIP. *, Epidermal
cell; dot, PVM cell.

The pattern of labeling observed with anti-δ-TIP antibodies
was unusual. Frequently large and numerous aggregates were
present along the tonoplast and within the vacuole lumen (e.g.,
Fig. 2). In leaves from depodded soybean plants, these δ-TIP-containing
aggregates frequently appeared almost to fill the
vacuole lumen (e.g., Fig. 5b, cells adjacent to asterisk). This
appearance indicates that formation of ΔVs may involve more
than insertion of δ-TIP molecules into preexisting tonoplast.
Perhaps these aggregates represent newly synthesized membrane
with high concentrations of δ-TIP that accumulates as
vesicles or strands within the vacuole lumen. It is likely that the 
aggregates provide a clue as to how the internal vacuole 
environment is changed to permit VSP accumulation.

Mechanisms for sorting soluble proteins to ΔVs have not
been elucidated. The finding, however, that soybean vacuoles
carrying the LV marker γ-TIP may be converted to ΔVs
indicates that traffic of soluble proteins to at least some ΔVs
probably involves a vacuolar-sorting receptor associated with 
the LV pathway (23, 24). This would explain why the vacuolar- 
sorting determinant of prosporamin (25), a sweet potato tuber 
protease inhibitor (26), interacts with that receptor (23, 27).

ABSTRACT Two sets of synthetic oligonucleotides coding
for amino acids in the amino- and carboxyl-terminal portions
of wheat germ agglutinin were synthesized and used as hybridization
probes to screen cDNA libraries derived from developing
embryos of tetraploid wheat. The nucleotide sequence for
a cDNA clone recovered from the cDNA library was determined
by dideoxynucleotide chain-termination sequencing in
vector M13. The amino acid sequence deduced from the DNA
sequence indicated that this cDNA clone (pNVR1) encodes
isolectin 3 of wheat germ agglutinin. Comparison of the
deduced amino acid sequence of clone pNVR1 with published
sequences indicates isolectin 3 differs from isolectins 1 and 2 by
10 and 8 amino acid changes, respectively. In addition, the
protein encoded by pNVR1 extends 15 amino acids beyond the
carboxyl terminus of the published amino acid sequence for
isolectins 1 and 2 and includes a potential site for N-linked
glycosylation. Utilizing the insert of pNVR1 as a hybridization
probe, we have demonstrated that the expression of genes for
wheat germ agglutinin is modulated by exogenous abscisic acid.
Striking homology is observed between wheat germ agglutinin
and chitinase, both of which are proteins that bind chitin.



Lectins, sugar-binding proteins derived mainly from plant 
sources, have been of great value as specific probes for 
investigating the distribution and function of carbohydrates 
on the surfaces of animal cells (1, 2). In recent years, 
however, the notion has become widely accepted that the 
ability of lectins to distinguish discrete sugars did not arise 
fortuitously during evolution (2), and as a result, there has 
been increased interest in the synthesis and biochemistry of 
this group of proteins. Wheat germ agglutinin (WGA), the 
first cereal lectin characterized in detail, binds specifically to
the sugar N-acetylglucosamine and to chitin, a polymer of
N-acetylglucosamine residues (3, 4). In the hexaploid wheat
Triticum aestivum, WGA exists as three closely related
isolectins derived from the A, B, and D genomes (5, 6). 
Comparison of the amino acid sequences for isolectin 1 (A 
genome) and isolectin 2 (D genome) indicates that these 
proteins differ at four residues (7, 8). The amino acid 
sequence for isolectin 3 (B genome), the least abundant form, 
is not yet available. These three isolectins randomly associate 
into functional dimers in vivo (5) and are immunologically 
indistinguishable (9). 

In wheat plants, WGA is found in the embryos and 
adventitious roots (9-11). During embryogenesis, WGA 
expression is under temporal control (12). Accumulation of 
WGA is tissue-specific and cell-type specific in various 
organs of the embryo (e.g., coleoptile, coleorhiza, and 
radicle) (9, 10). In other species of Gramineae, a lectin 
immunologically related to WGA is synthesized during seed 
development and in the roots of adult plants (13, 14).
Furthermore, the accumulation of lectin is modulated by the
hormone abscisic acid (12, 15). Although biochemical, im- 
munological, and microscopic studies have helped to char- 
acterize the composition and distribution of WGA (3-8, 10, 
11), the genes for WGA have not been isolated. 

We are interested in investigating the molecular mecha- 
nisms that regulate the developmental tissue-specific expres- 
sion of WGA genes. To isolate clones for WGA, cDNA 
libraries from developing grains of the tetraploid wheat 
Triticum durum (AABB) were used. Here, we report the 
isolation and the nucleotide sequence of a cDNA clone that 
we presume to encode isolectin 3.* Using this clone as a 
hybridization probe, we present evidence that the expression 
of WGA genes is modulated by abscisic acid. Because of the 
common ability of WGA and chitinase to bind chitin, we 
searched for amino acid homology using the recently pub- 
lished sequence for chitinase from Phaseolus vulgaris (16). 
We found strong homology between the amino terminus of 
chitinase and four regions of WGA. The significance of this 
similarity is addressed. 

MATERIALS AND METHODS 
Plant Material. Wheat T. aestivum L. (AABBDD) cv. 
Marshall was obtained from the Minnesota Crop Improvement 
Association (St. Paul, MN). Plants were grown as previously 
described (11), and embryos were collected at 20 days after 
bloom (anthesis) according to Raikhel and Quatrano (12).
Abscisic acid treatment involved culturing isolated embryos in 
the dark at 27°C for 3 days on filter paper containing growth 
medium (15) with and without 10~ 4 M abscisic acid (Sigma). 

Materials. Two cDNA libraries, derived from mRNA 
isolated from developing wheat grains of T. durum (AABB) 
cv. Mexicali at 3 and 4 weeks post-anthesis, were provided 
by C. Brinegar (ARCO Plant Cell Research Institute, Dublin, 
CA). Two sets of degenerate synthetic oligonucleotides were 
prepared for amino acid regions in isolectin 1 (8): for the 
sequence Asn-Met-Glu-Cys-Pro-Asn-Asn in the amino-ter- 
minal region (residues 9-15), probe 2, TTR TAC CTY ACR 
GGN TTR TT; and for the sequence Cys-Thr-Asn-Asn-Tyr- 
Cys-Cys in the carboxyl-terminal region (residues 141-147), 
probe 1, ACR TGN TTR TTR ATR ACR AC. The oligonu- 
cleotide mixtures were synthesized on an Applied Biosys- 
tems (Foster City, CA) 380 DNA synthesizer by a solid-phase 
method (17) and separated by electrophoresis on a 20% 
polyacrylamide gel containing 8 M urea in TBE, pH 8.3 (0.89 
M Tris/0.089 M boric acid/2.7 mM EDTA). The oligonucle- 
otides were eluted in 0.5 M ammonium acetate/10 mM 
magnesium acetate/0.1% NaDodSO4, and then end-labeled 
with 32P using T4 polynucleotide kinase (18). 
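
The relationship between the two peptide segments and the probe mixtures can be checked with a short sketch. It assumes the standard genetic code written in IUPAC ambiguity symbols (R = A or G, Y = C or T, N = any base) and writes each probe base-for-base complementary to the sense strand, as the probes are printed above. The function names are illustrative only and are not part of the original methods.

```python
# IUPAC degenerate codons (sense strand) for the residues in the two peptides.
CODON = {"Asn": "AAY", "Met": "ATG", "Glu": "GAR", "Cys": "TGY",
         "Pro": "CCN", "Thr": "ACN", "Tyr": "TAY"}

# Complement of each IUPAC symbol used here, and how many bases it stands for.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G",
              "R": "Y", "Y": "R", "N": "N"}
BASES_PER_SYMBOL = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "N": 4}


def complementary_probe(peptide, length=20):
    """Back-translate a peptide and return the base-for-base complement,
    truncated to `length` nucleotides."""
    sense = "".join(CODON[residue] for residue in peptide)[:length]
    return "".join(COMPLEMENT[base] for base in sense)


def degeneracy(probe):
    """Number of distinct sequences represented by a degenerate probe."""
    count = 1
    for symbol in probe:
        count *= BASES_PER_SYMBOL[symbol]
    return count


probe_2 = complementary_probe(["Asn", "Met", "Glu", "Cys", "Pro", "Asn", "Asn"])
probe_1 = complementary_probe(["Cys", "Thr", "Asn", "Asn", "Tyr", "Cys", "Cys"])
print(probe_2, degeneracy(probe_2))
print(probe_1, degeneracy(probe_1))
```

Under these assumptions the sketch reproduces the two 20-nucleotide probe sequences given above.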



Abbreviation: WGA, wheat germ agglutinin. 

*This sequence of isolectin 3 of wheat germ agglutinin is being 
deposited in the EMBL/GenBank data base (Bolt, Beranek, and 
Newman Laboratories, Cambridge, MA, and Eur. Mol. Biol. Lab., 
Heidelberg) (accession no. J02961). 
Isolation and Screening of cDNA Clones. The cDNA librar- 
ies, in Escherichia coli strain DH5α (19), were plated directly 
onto nitrocellulose filters laid on agar plates containing Luria 
broth medium with ampicillin at 50 µg/ml (20). After colonies 
were established, the bacteria were lysed, and the filters were 
probed with oligonucleotide probes 1 and 2 as follows. The 
temperature of hybridization (T_H) for each oligonucleotide 
was calculated using the formula T_H = T_D - 12°C, where T_D 
= 2°C x (the number of A·T base pairs) plus 4°C x (the 
number of G·C base pairs). Replicate filters were prehybrid- 
ized in 6x SSC (0.9 M sodium chloride and 0.09 M sodium 
citrate, pH 7.0) (1x SSC = 0.15 M sodium chloride/0.015 M 
sodium citrate, pH 7) plus 0.25% nonfat milk, and hybridized 
in the same buffer containing the labeled oligonucleotide 
probes and 0.1% NaDodSO4 at 36°C (probe 1) or 38°C (probe 
2). After hybridization, filters were washed three times in 6x 
SSC/0.25% nonfat milk/0.1% NaDodSO4 at room tempera- 
ture for 10 min, followed by a 2-min wash at 46°C (probe 1) 
or at 48°C (probe 2). Filters were dried and autoradiographed 
for 16-18 hr. Colonies that produced positive signals were 
selected and rescreened using the same probes under the 
same conditions. 
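
The hybridization temperatures quoted above can be reproduced from the stated formula with a small sketch. One assumption is made that the text does not state explicitly: for a degenerate mixture, T_D is taken for the lowest-melting member, so every ambiguous position that can be A or T counts as an A·T pair. The probe strings are those given under Materials; the function names are illustrative.

```python
# Bases allowed by each IUPAC symbol that appears in the two probes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}


def min_dissociation_temp(probe):
    """T_D = 2 C per A.T pair + 4 C per G.C pair, taking at each degenerate
    position the lowest-melting base allowed (assumption, see lead-in)."""
    t_d = 0
    for symbol in probe:
        t_d += min(2 if base in "AT" else 4 for base in IUPAC[symbol])
    return t_d


def hybridization_temp(probe):
    """T_H = T_D - 12 C, as defined in the text."""
    return min_dissociation_temp(probe) - 12


PROBE_1 = "ACRTGNTTRTTRATRACRAC"
PROBE_2 = "TTRTACCTYACRGGNTTRTT"
for name, probe in (("probe 1", PROBE_1), ("probe 2", PROBE_2)):
    print(name, min_dissociation_temp(probe), hybridization_temp(probe))
```

With this reading, the computed T_H values match the 36°C and 38°C used for hybridization, and the final washes at 46°C and 48°C fall 2°C below the corresponding minimum T_D.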

DNA Sequence Determination. Inserts from recombinant 
plasmids were purified by electrophoresis in low-melting- 
point agarose. Excised cDNA inserts or appropriate restric- 
tion fragments were then subcloned into M13mp18 or 
M13mp19. Dideoxynucleotide chain-termination sequencing 
from single-stranded M13 templates was accomplished using 
a Bethesda Research Laboratories M13 sequencing kit with 
the exception that dGTP was replaced by 7-deaza-2'-deoxy- 
guanosine triphosphate (Boehringer Mannheim). 

RNA Blot Analysis. Total RNA was isolated as described 
(21) and poly(A)+ RNA was purified by chromatography on 
oligo(dT)-cellulose (18). Poly(A)+ RNA was electrophoresed 
in adjacent lanes (1 µg per lane) on 2% agarose gels containing 
6% formaldehyde and then transferred to nitrocellulose (22). 
Filters were hybridized with inserts labeled by the random 
primer method of Feinberg and Vogelstein (23) and washed 
under stringent conditions as described in Thomas (22). 

RESULTS 

Isolation of cDNA Clones. Two synthetic oligonucleotides, 
each consisting of 20 nucleotides complementary to the 5' and 
3' ends of the coding portion of isolectin 1 mRNA (8), were used 
for isolation of cDNA clones specific for WGA. These two 
sequences corresponded to amino acids 9-15 (probe 2) and 
141-147 (probe 1). Because of the degeneracy of the sequences 
involved, probe 2 was a mixture of 64 sequences, and probe 1 
was a mixture of 128 sequences. One cDNA clone, pNVR1 [1.0 
kilobase (kb)], was selected by hybridization to both probes on 
the assumption that this insert contains sequences spanning the 
coding region delimited by the oligonucleotide probes. A 
second clone, pNVR2 (0.7 kb), was recognized by probe 1 only 
and is presumably truncated at the 5' end. The restriction map 
and partial sequence of pNVR2 indicate that it is a shorter 
version of pNVR1. When the insert from clone pNVR1 was 
labeled by the random primer method (23) and used as probe to 
rescreen the cDNA libraries, no additional cDNA clones were 
retrieved. 

Fig. 1. Restriction map and sequencing strategy for WGA cDNA 
clone pNVR1. Open bar, cloned cDNA; arrows, length and direction 
of the sequenced restriction fragments. Scale of the map is in kb 
pairs. 

Nucleotide Sequence. The cDNA insert of pNVR1 was sub- 
cloned into M13mp18 and M13mp19 according to the strategy 
shown in Fig. 1, and its nucleotide sequence was determined as 
described. The nucleotide sequence of the cDNA clone and the 
deduced amino acid sequence are shown in Fig. 2. Clone 
pNVR1 contains a 558-nucleotide open reading frame encoding 
a 186-amino acid polypeptide rich in cysteine and glycine but 
lacking an ATG start codon at the 5' end. Protein sequence 
analysis indicates that the amino terminus of WGA is blocked 
(7, 8) so that the first residue (glutamine) of the published 
sequence may not be the amino terminus of mature WGA. It is, 
therefore, presumably fortuitous that the cDNA clone pNVR1 
and the published amino acid sequence for WGA initiate with 
the same amino acid. The hydropathy plot (24) of the polypep- 
tide encoded by clone pNVR1 shows the polypeptide to be 
composed mostly of hydrophilic amino acids (Fig. 3). The 
polypeptide encoded by pNVR1 extends 15 amino acids beyond 
the carboxyl terminus of the amino acid sequence published for 
isolectins 1 and 2 (Fig. 2, squares) (7, 8). The carboxyl-terminal 
segment contains the most hydrophobic portion of the entire 
protein (Fig. 3). A potential site for N-linked glycosylation 
occurs at residues 180-182 (Asn-Ser-Thr) (Fig. 2, dots above 
squares). The 3'-untranslated region contains four in-frame 
termination codons (TGA, TGA, TAA, and TAG, underlined in 
Fig. 2) and a potential polyadenylylation signal (AATAAT, 
double-underlined in Fig. 2), and terminates in a poly(A) tail that 
begins 229 nucleotides downstream from that signal. 
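
A potential N-linked glycosylation site of the kind described above is conventionally recognized as the tripeptide Asn-X-Ser/Thr, where X is any residue except proline. The following is a minimal sketch of such a scan. The 15-residue string is the carboxyl-terminal extension (residues 172-186) as read from Fig. 2 of this text and should be treated as illustrative, since the figure is difficult to read; the function name is hypothetical.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead allows
# overlapping candidate sites to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence, first_residue=1):
    """Return the residue numbers of Asn in every Asn-X-Ser/Thr tripeptide.

    `first_residue` is the numbering of the first character of `sequence`.
    """
    return [match.start() + first_residue for match in SEQUON.finditer(sequence)]


# Carboxyl-terminal 15 residues, one-letter code (illustrative transcription).
C_TERMINAL_EXTENSION = "VFAEAIATNSTLLAE"
print(find_sequons(C_TERMINAL_EXTENSION, first_residue=172))
```

Run on this fragment, the scan reports a single candidate site whose Asn is residue 180, consistent with the Asn-Ser-Thr at residues 180-182 noted in the text.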

Comparison with Published Sequences of Isolectin 1 and 2. 
The amino acid sequence deduced from the cDNA nucleotide 
sequence (Fig. 2) was compared with published protein 
sequence data. Re-evaluation of the discrepancies at posi- 
tions 134 and 150 (Fig. 2, arrows) has indicated a low yield of 
lysine in addition to glycine for residue 134 (C. Wright, 
personal communication) and has confirmed the presence of 
tryptophan at residue 150 (25). The deduced amino acid 
sequence of pNVR1 was found to differ from the published 
sequence of isolectin 1 (8) at 10 positions and isolectin 2 at 8 
positions (7) (Table 1). 

RNA Blot Analysis. Embryos isolated from hexaploid 
wheat at 20 days post-anthesis were cultured in the presence 
and absence of abscisic acid (Fig. 4). Equal amounts of 
poly(A)+ RNA from the embryos were fractionated by 
agarose-formaldehyde gel electrophoresis and transferred to 
nitrocellulose filters. A 1.1-kb mRNA was detected (Fig. 4) 
after hybridization with pNVR1 insert labeled by the random 
primer method (23). The autoradiograph showed that the 
level of RNA in abscisic acid-treated embryos was several 
times higher than the level in embryos cultured in the absence 
of abscisic acid. 

Nucleotide and Amino Acid Homology Between WGA and 
Chitinase. The deduced amino acid sequence of cDNA clone 
pNVR1 was used to search for homology with chitinase, an 
enzyme that catalyzes the hydrolysis of 1,4-β linkages of 
N-acetylglucosamine polymers in chitin. The amino acid 
homology matrix between clone pNVR1 and chitinase from 
P. vulgaris is shown in Fig. 5. This matrix was generated 
using the analysis program of Pustell and Kafatos (26) with 
parameters set so that each letter within the matrix represents 
a match of 50% or greater over a span of 21 amino acids. 
Extensive homology between the amino terminus of chitinase 
and four regions of WGA is apparent. 
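
The idea behind such a homology matrix can be illustrated with a simplified sketch. It scores only exact identities within a 21-residue window (10 residues on either side of each point) and prints a letter wherever the match is 50% or greater, with A = 100%, B = 98%, down to Z = 50%. This is not the Pustell and Kafatos program: their scale-factor weighting and compression are not reproduced, windows are simply truncated at the sequence ends, and the function names are hypothetical.

```python
def score_letter(percent):
    """Map a percent match to a letter: A = 100%, B = 98%, ..., Z = 50%."""
    return chr(ord("A") + int((100 - percent) // 2))


def homology_matrix(seq_x, seq_y, half_window=10, minimum=50):
    """Return one string per residue of seq_y; column i corresponds to
    residue i of seq_x. A blank means the windowed match is below `minimum`."""
    rows = []
    for j in range(len(seq_y)):
        row = []
        for i in range(len(seq_x)):
            matches = total = 0
            # Compare residues along the diagonal through (i, j).
            for k in range(-half_window, half_window + 1):
                a, b = i + k, j + k
                if 0 <= a < len(seq_x) and 0 <= b < len(seq_y):
                    total += 1
                    matches += seq_x[a] == seq_y[b]
            percent = 100 * matches / total
            row.append(score_letter(percent) if percent >= minimum else " ")
        rows.append("".join(row))
    return rows


# Toy demonstration: a sequence compared with itself gives a diagonal of A's.
demo = homology_matrix("ACDEFG", "ACDEFG", half_window=1)
print("\n".join(demo))
```

Regions of similarity appear as diagonal runs of letters, which is how the four WGA regions matching the amino terminus of chitinase would show up in a plot of this kind.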

DISCUSSION 
In this paper we present the amino acid sequence of WGA as 
deduced from a cDNA clone designated pNVR1. That this 
clone encodes WGA has been verified by comparison of the 
Asn 20 
gln arg cys gly glu gln gly ser gly met glu cys pro asn asn leu cys cys ser gln tyr gly tyr cys gly met 

CAA ACC TCC CCC CAC CAC CCC ACC GCC ATC CAC TCC CCC AAC AAC CTC TCC TCC AGC CAC TAC GCC TAC TCC CCC ATC 

Asp 40 

gly gly asp tyr cys gly lys gly cys gln asn gly ala cys trp thr ser lys arg cys gly ser gln ala gly gly 
CCC CCC GAT TAC TQC CGC AAC CCC TCC CAC AAC CCC GOC TCC TCC ACC AGC AAC CCC TCT CCC ACC CAC CCC CCC GCC 
Ala * * 60 * 

lys thr cys pro asn asn his cys cys ser gln tyr gly his cys gly phe gly ala glu tyr cys gly ala gly cys 

AAC ACC TCC CCC AAC AAC CAC TCC TCC ACC CAC TAC CCC CAC TCC CCC TTC CCC CCC CAC TAC TCC CCC CCC CCC TCC 

80 Ser 100 

gln gly gly pro cys arg ala asp ile lys cys gly ser gln ala gly gly lys leu cys pro asn asn leu cys cys 

CAC GCC GCC CCC TCC-CCC CCC CAC ATC AAC TCC CCC ACC CAC CCC CCC CCC AAC CTC TCC CCC AAC AAC CTC TCC TCC 

Ser Gly 120 Ser 

ser gln trp gly tyr cys gly leu gly ser glu phe cys gly glu gly cys gln asn gly ala cys ser thr asp lys 

ACC CAC TCC CCC TAC TCC CGC CTC CCT TCC CAC TTC TCC CCC GAG CCC TCC CAC AAC CCC OCT TCC ACC ACC CAC AAC 

pro cys gly lys asp ala gly gly arg val cys thr asn asn tyr cys cys ser lys trp gly ser cys gly ile gly 

CCC TCT GCC AAC CAC CCC CGC CCC AGC CTT TCC ACT AAC AAC TAC TCC TCT AGC AAC TCC CCA TCC TCT CGC ATC CCT 

160 Ala 
pro gly tyr cys gly ala gly cys gln ser gly gly cys asp gly val phe ala glu ala ile ala thr asn ser thr 

CCC CCC TAC TCC CCT CCA GCC TCC CAC AGC CCC GCC TCC CAT CCT CTC TTC CCC GAG GCC ATC CCC ACC AAC TCC ACT 

■ ■ ■ ■ 

leu leu ala glu end 

CTT CTC CCA CAA TCA TCA TCTTCCJAATGCTACTATTCCAACCACCAATAATCCCT 
TTTTACTACTACTACTT AAJAA TTCTCTACCTTCCMTATC^ 
ACACAACTGnGTCTCCCAATATACACTGTACTATAOT 
CTTCCACTACTTCCTGATATC£TTGCAATATATC 

Fig. 2. Nucleotide sequence of WGA cDNA clone pNVR1. The deduced amino acid residues are shown above the nucleotide triplets. The 
differences between the deduced amino acid sequence and the published amino acid sequence of isolectin 2 are designated by the amino acid 
codes above the deduced sequence. The additional differences with isolectin 1 are designated by asterisks. Proline at position 56 is substituted 
with threonine, and the histidines at positions 59 and 66 are substituted with glutamine and tyrosine, respectively. Previously described 
differences at positions 150 (25) and 134 (arrows) have been resolved (C. Wright, personal communication). Four termination codons (single 
underline) and a putative polyadenylylation signal (double underline) are indicated. Fifteen amino acids that extend beyond the carboxyl terminus 
of the published sequence for WGA are designated by squares. A potential glycosylation site is indicated by dots above the squares. 



deduced amino acid sequence with the sequence determined 
by direct amino acid sequencing of the purified protein (7, 8). 
The length of the polypeptide derived from the deduced 
amino acid sequence is 186 amino acids, and the calculated 
Mr is 18,754. Nevertheless, pNVR1 does not represent the 
complete coding sequence for WGA. First, the initiating 
methionine codon is absent from the cDNA. Second, because 
WGA is synthesized on and translocated across the rough 
endoplasmic membrane (27), an amino-terminal signal pep- 
tide would be expected (28). Third, there may be one or more 
amino acids at the amino terminus that have not been 
detected by peptide sequencing because of blockage of the 
amino terminus (7, 8). The size of the mRNA recognized by 
clone pNVR1 predicts a full-length cDNA of 1.1 kb. 



The DNA sequence of pNVR1 encodes a protein that 
extends 15 amino acids beyond the carboxyl terminus of the 
published amino acid sequence and includes a potential site 
for N-linked glycosylation. Mature WGA is not a glycopro- 
tein, but its precursor form is glycosylated (27). The site of 
glycosylation probably lies in the 15 amino acid carboxyl- 
terminal sequence because the only possible site for 
glycosylation resides in this region. The glycosylated precur- 
sor is known to be processed (27) and to accumulate in 
protein bodies or vacuoles (10, 29). The WGA precursor in 
the endoplasmic reticulum-associated fraction is 5 kDa larger 
than the mature WGA (27). The difference in molecular mass 
between the precursor and mature WGA may be a conse- 
quence of the extra 15 amino acids and glycosylation of the 
Fig. 3. Hydropathy plot of the protein encoded by cDNA pNVR1. Ordinate, hydropathic index (24); abscissa, amino acid position. The 
additional 15 amino acids at the carboxyl terminus are right of the broken line. 
Table 1. Amino acids at positions in which there are differences 
between the residues of isolectins 1 and 2 and the protein 
encoded by pNVR1 

Amino acid    Isolectin 1    Isolectin 2    pNVR1 
    56            Thr            Pro          Pro 
    59            Gln            His          His 
    66            Tyr            His          His 
    93            Ala            Ser          Ala 
     9            Asn            Asn          Gly 
    37            Asp            Asp          Asn 
    53            Ala            Ala          Lys 
   109            Ser            Ser          Tyr 
   119            Gly            Gly          Glu 
   123            Ser            Ser          Asn 
   171            Ala            Ala          Gly 

carboxyl terminus. The hydropathy plot of the amino acid 
sequence derived from pNVR1 clearly indicates that the 
carboxyl terminus of the cloned WGA sequence consists of 
hydrophobic amino acids, which is consistent with the 
possibility that it is removed post-translationally. Removal of 
carboxyl-terminal residues was seen during maturation of 
napin, a rapeseed storage protein (30). It was recently shown 
that the lectin concanavalin A (Con A), which is not a 
glycoprotein, is synthesized as a glycosylated precursor (31). 
Normal transport of this protein is dependent on the presence 
of the glycan (32). It is interesting that WGA precursor is a 
biologically active lectin (27), whereas precursor for Con A 
does not have lectin activity (31). In other words, the loss of 
the pro-WGA carboxyl-terminal domain does not relate to its 
ability to bind N-acetylglucosamine. Alternatively, cleavage 
of the carboxyl terminus may occur during the purification of 
WGA such that the mature protein actually contains 186 
amino acids in vivo. 

Clone pNVR1 mRNA contains four termination codons 
and a 3'-untranslated region. A potential polyadenylylation 
signal (AAUAAU) is found in the noncoding region followed 
by a poly(A) tail. Whereas the consensus sequence for the 
polyadenylylation signal is very highly conserved in animal 
systems (AAUAAA), plant mRNAs frequently deviate from 
this theme (33). The deduced amino acid sequence confirms 
extensive interdomain homology. The 7-amino acid sequence 
Gly-Cys-Gln-Asn-Gly-Ala-Cys is found at residues 34-40 
and again at residues 120-126. Short repeated stretches of 




Fig. 4. RNA blot analysis of WGA mRNA levels. Poly(A)+ RNA 
(1 µg), isolated from embryos excised at 20 days post-anthesis and 
cultured in the presence (lane 1) and absence (lane 2) of abscisic acid, 
was separated on a 2% agarose, 6% formaldehyde gel. After 
transferring the RNA to nitrocellulose, the filter was hybridized to a 
32P-labeled DNA insert from pNVR1 under stringent conditions. 
Positions of DNA Mr markers were obtained from the ethidium 
bromide-stained portion of the gel. 

Tyr-Cys-Gly, Ala-Gly-Gly, Gly-Cys-Gln, Cys-Cys-Ser, or 
Cys-Gly-Gly are found throughout the polypeptide. 

Amino acid sequence studies on wheat isolectins 1 (A 
genome) and 2 (D genome) (5-8) indicate that they differ 
distinctly in their histidine content: two histidines in isolectin 
2 and no histidine in isolectin 1 (8). Because clone pNVR1 
was isolated from a cDNA library derived from the tetraploid 
wheat T. durum (AABB), the cDNA clone cannot encode 
isolectin 2 derived from the D genome. Furthermore, pNVR1 
probably does not encode isolectin 1 from the A genome. 
Isolectin 1 does not contain any histidine, whereas pNVR1 
encodes a protein containing two histidine residues. Thus, 
pNVR1 probably represents isolectin 3 derived from the B 
genome. Although the amino acid compositions of isolectins 
2 and 3 are very similar, eight discrete differences were 
identified between them. At least four of these differences 
(residues 9, 53, 93, and 119) are authentic. The x-ray 
crystallographic data for these four positions in isolectin 2 are 
definitive, and there is no evidence for heterogeneity in 
peptide preparations (7). The discrepancies at the remaining 
Fig. 5. Amino acid homology matrix of WGA (x axis) and chitinase from P. vulgaris (y axis). Homology matrices were plotted with the Pustell 
and Kafatos sequence analysis program (26) using the following parameters: range = 10, scale factor = 0.75, minimum value = 50, compression 
= 1. Each letter represents homology at that point in the matrix where A = 100%, B = 98%, ..., Z = 50%. Only homologies in the first 40 
amino acids of chitinase are plotted; the remainder of the protein shows no homology with WGA. 
four positions (37, 109, 123, and 171) between the deduced 
amino acids and isolectin 2 could be because of inaccuracies 
resulting from cross-contamination of the isolectins during 
fractionation. 

Abscisic acid treatment of developing wheat embryos has 
been shown to affect temporal expression of WGA (12, 15). 
Using clone pNVR1 as a hybridization probe, we found that 
abscisic acid treatment of excised wheat embryos modulates 
mRNA levels for WGA, which is consistent with known 
effects of abscisic acid on lectin levels (15). Similar results 
were reported by Williamson et al. (21) for the abundant 
embryo storage protein. It is possible that abscisic acid 
regulation is based upon changes in the rates of mRNA 
transcription, turnover, or processing. It also needs to be 
mentioned that clone pNVR1 may be hybridizing to the 
mRNAs for isolectins 1 and 2, as well as to the mRNA for 
isolectin 3 on the RNA blot. Given the similarity of the 
isolectin sequences, the high-stringency conditions used for 
hybridization may not have prevented cross-hybridization 
with mRNAs from related isolectins. 

WGA and chitinase are two chitin-binding proteins that are 
thought to have antimicrobial activity (34). Recently, how- 
ever, evidence was presented to show that antifungal activity 
of WGA can result from contamination by chitinase (35). 
Comparison of amino acid sequences demonstrated a striking 
homology between the amino terminus of chitinase (16) and 
four regions of the WGA molecule. The amino acid residues 
of WGA directly involved in primary sugar-binding sites are 
tyrosine-73, serine-62, and glutamic acid-115 (36). These 
three residues are found in the regions of homology between 
chitinase and WGA. One may speculate that these regions of 
homology account for the similarity in chitin-binding activity 
of these proteins and, subsequently, in copurification. Ad- 
ditionally, the sequence homology between WGA and chi- 
tinase may be of functional significance. 
Different Legumin Protein Domains Act as Vacuolar 
Targeting Signals 

Gerhard Saalbach,1 Rudolf Jung, Gotthard Kunze, Isolde Saalbach, Klaus Adler, and Klaus Müntz 

Institute of Genetics and Crop Plant Research, Corrensstrasse 3, O-4325 Gatersleben, Sachsen-Anhalt, 
Federal Republic of Germany 

Legumin subunits are synthesized as precursor polypeptides and are transported into protein storage vacuoles in 
field bean cotyledons. We expressed a legumin subunit in yeast and found that in these cells it is also transported 
into the vacuoles. To elucidate vacuolar targeting information, we constructed gene fusions of different legumin 
propolypeptide segments with either yeast invertase or chloramphenicol acetyltransferase as reporters for analysis 
in yeast or plant cells, respectively. In yeast, increasing the length of the amino-terminal segment increased the 
portion of invertase directed to the vacuole. Only the complete legumin α chain (281 amino acids) directed over 
90% to the vacuole. A short carboxy-terminal legumin segment (76 amino acids) fused to the carboxy terminus of 
invertase also efficiently targeted this fusion product to yeast vacuoles. With amino-terminal legumin-chloramphen- 
icol acetyltransferase fusions expressed in tobacco seeds, efficient vacuolar targeting was obtained only with the 
complete α chain. We conclude that legumin contains multiple targeting information, probably formed by higher 
structures of relatively long peptide sequences. 



INTRODUCTION 

The secretory system is used to transport proteins via the 
endoplasmic reticulum (ER) and the Golgi apparatus to the 
cell surface or to the vacuole/lysosome (Schekman, 1985; 
Jones and Robinson; Chrispeels, 1991). Transloca- 
tion across the ER membrane is mediated by an amino- 
terminal signal sequence (Perara and Lingappa, 1988). 
Secretion occurs by default, whereas positive sorting in- 
formation is required for transport to the vacuole (Kelly, 
1985; Pfeffer and Rothman, 1987; Wieland et al., 1987; 
Dorel et al., 1989). 

In mammalian cells, transport to the lysosomes is me- 
diated by mannose-6-phosphate (von Figura and Hasilik, 
1986), but in yeast and plant systems, glycans do not act 
as vacuolar sorting signals (Schwaiger et al., 1982; Voelker 
et al., 1989). In the yeast hydrolases carboxypeptidase Y 
(CPY) (Johnson et al., 1987; Valls et al., 1987) and pro- 
teinase A (Klionsky et al., 1988), vacuolar sorting infor- 
mation resides in short amino acid sequences at the amino 
terminus. In plants, Tague et al. (1990) found that the 43 
amino-terminal amino acids of mature phytohemagglutinin 
(PHA) from the common bean are also sufficient to sort 
invertase to the yeast vacuole. A set of four contiguous 
amino acids (QRPL) in the CPY signal was identified as 
critical for vacuolar localization (Valls et al., 1990). For 
PHA, the sequence LQR was also found to be important 
(Tague et al., 1990). However, the proteinase A sorting 

1 To whom correspondence should be addressed. 



signal does not resemble the CPY targeting element and 
the PHA LQR sequence is only partially conserved among 
lectins. Changes in this region in native PHA do not result 
in higher levels of secretion, indicating the presence of 
multiple targeting information in PHA. 

The study of PHA-invertase fusions in Arabidopsis 
showed that the PHA segments that target invertase to 
yeast vacuoles are not sufficient for transport to plant 
vacuoles, indicating that differences exist between yeast 
and plant vacuolar sorting processes (Chrispeels, 1991). 
In a different lectin from barley, a carboxy-terminal propep- 
tide domain of 15 amino acids is necessary for targeting 
to the vacuoles of tobacco cells (Bednarek et al., 1990). 

Here we report vacuolar sorting results obtained from 
the analysis of the 11S globulin legumin, the major field 
bean seed protein that accumulates in cotyledon cell stor- 
age vacuoles. This protein contains six similar subunits, 
each composed of two disulfide-linked chains arising from 
a common precursor (Bassuner et al., 1983; Horstmann, 
1983). An amino-terminal signal sequence mediates the 
cotranslational insertion into the lumen of the ER (Bassuner 
et al., 1984) and the propolypeptide is then transported by 
way of the Golgi apparatus (Zur Nieden et al., 1984) into 
the storage vacuoles. In the ER, 11S propolypeptides are 
assembled into trimers, whereas hexamers are formed 
only in the storage vacuoles (for review, see Akazawa and 
Hara-Nishimura, 1985). The 11S globulins do not contain 
any sites for N-linked glycosylation (Bäumlein et al., 1986; 




Expression of Legumin in Yeast 
Legumin Is Transported to Yeast Vacuoles 
Table 1. Invertase Activity Produced by Amino-Terminal 
Legumin-Invertase Fusions in Yeast 

                  Invertase Activity(a) 
Fusion          Total     External     % Secretion 
Le-Inv-sp         350        350           100 
Le-Inv-7         2500       2500           100 
Le-Inv-20        1870       1870           100 
Le-Inv-28        2000       2000           100 
Le-Inv-39        1100       1040            94 
Le-Inv-50         670        570            85 
Le-Inv-62        1800       1705            95 
Le-Inv-86         320        214            66 
Le-Inv-128        330        220            66 
Le-Inv-169        705        350            50 
Le-Inv-281        600         45             7 
Le-Inv-462        -(b) 

(a) Invertase activity is in units (nanomoles of glucose per minute at 
30°C) per OD600 yeast cells. 
(b) Fusion unstable in pAAH5. 



128 amino acids. A segment with 169 amino acids kept 
50% intracellular, and the whole α chain (281 amino acids) 
was necessary to retain more than 90% inside the yeast 
cells. 
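
The secretion and retention percentages discussed here follow directly from the total and external activities in Table 1. The sketch below recomputes them; the variable and function names are illustrative, and the comparison allows a one-point tolerance because the published percentages do not all appear to be rounded in the same way.

```python
# (fusion, total activity, external activity, % secretion as printed in Table 1)
TABLE_1 = [
    ("Le-Inv-sp", 350, 350, 100),
    ("Le-Inv-7", 2500, 2500, 100),
    ("Le-Inv-20", 1870, 1870, 100),
    ("Le-Inv-28", 2000, 2000, 100),
    ("Le-Inv-39", 1100, 1040, 94),
    ("Le-Inv-50", 670, 570, 85),
    ("Le-Inv-62", 1800, 1705, 95),
    ("Le-Inv-86", 320, 214, 66),
    ("Le-Inv-128", 330, 220, 66),
    ("Le-Inv-169", 705, 350, 50),
    ("Le-Inv-281", 600, 45, 7),
]


def percent_secretion(total, external):
    """External (secreted) activity as a percentage of total activity."""
    return 100 * external / total


def percent_retention(total, external):
    """Activity retained inside the cells, as a percentage of total."""
    return 100 - percent_secretion(total, external)


for fusion, total, external, printed in TABLE_1:
    secreted = percent_secretion(total, external)
    print(f"{fusion:12s} secreted {secreted:5.1f}%  "
          f"retained {100 - secreted:5.1f}%  (table: {printed}%)")
```

For Le-Inv-281 the recomputed retention exceeds 90%, in line with the statement that the whole α chain was needed to keep more than 90% of the activity inside the cells.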

Native legumin is transported to the yeast vacuoles. To 
determine whether the legumin segments also target the in- 
tracellular portions of invertase activity to the vacuoles, we 
isolated vacuoles from the strain harboring the Le-Inv-86 
fusion, retaining 33% of the activity inside the cell. Recov- 
ery of vacuoles was calculated from the α-mannosidase 
(vacuolar marker) activity in the spheroplast lysate and in 
the vacuole fraction. Contamination with ER or cytoplasm 
was low (approximately 10%), as determined by assaying 
the fractions for cytochrome c reductase and α-glucosi- 
dase, respectively. The recovery of intracellular invertase 
in the vacuole fraction was calculated from the activities in 
the whole cells, in the spheroplast supernatant, in the 
spheroplast lysate, and in the vacuole fraction. Figure 6 
shows that the intracellular invertase activity cofraction- 
ated with the vacuolar marker α-mannosidase, indicating 
that the legumin fusions are targeted to the vacuoles. 



A Short Carboxy-Terminal Legumin Segment 
Efficiently Targets Invertase to Yeast Vacuoles 

The carboxy-terminal invertase-legumin fusions were ana- 
lyzed in yeast in the same way as described for the amino- 
terminal fusions. The results are shown in Table 2. A short 
segment of 13 amino acids caused practically no reduction 
of secretion. In the case of the Inv-Le-C38 fusion, the total 
cellular invertase activity was very low, indicating a struc- 
tural interaction between a specific legumin segment and 
invertase, and 28% of that activity was retained inside the 
cells. With an additional 13 amino acids (Inv-Le-C51), 
activity was normal and only 17% was retained. Six addi- 
tional amino acids (Inv-Le-C57) caused an increase in the 
retention to 43% inside the cells and another 6 amino acids 
(Inv-Le-C63) increased that value to only 54%. With the 
addition of the last 76 carboxy-terminal amino acids of 
legumin (for the sequence, see Figure 5) to invertase, 
practically complete (93%) retention was achieved. This 
activity was also transported efficiently to the yeast vacu- 
oles, as shown by vacuole isolation described above (Fig- 
ure 6). These results indicated that the essential sequence 
in the carboxy-terminal signal might be located in the 
amino-terminal part of the 76-amino acid segment. How- 
ever, deletion of the carboxy-terminal half (Inv-Le-C76-38) 
resulted in considerable loss of vacuolar transport to only 
50%, indicating that the complete segment is necessary 
to form the signal. 



Does the Level of Expression Influence the Secretion/ 
Retention Ratio? 

The total invertase activities expressed from the different 
gene fusions ranged from about 200 units per OD of cells 
(Inv-Le-C76) to 2500 units per OD of cells (Le-Inv-7). These 
differences could be due to the influence of the legumin 
segments on the specific activities of the fusion molecules. 
In the case of CPY-invertase and PHA-invertase fusions, activ- 
ities were not as variable. Values of approximately 200 to 
300 and 350 to 750 units per OD of cells, respectively, 
were observed (Johnson et al., 1987; Tague et al., 1990). 
About threefold enhancement of the expression of a 
PHA-Inv fusion caused an increase in secretion from 10% 
to 22%, probably because of the saturation of the vacuolar 
sorting machinery (Tague et al., 1990). To analyze whether 



TSSEFDRLNO CRLDNINALE POHRVESEAQ LTETMMPNHP 40 

ELRCAGVSLI RRTIDPNQLH LP8YSPSPQL IYIIOGXGVI 80 

GLTLPGCPQT YQEPRSSQSR QQSRQQQPOS HQKIRRFRKQ 120 

DIIAIPSGIP YWTYNNOOEP LVAtSLLOTS HIAMQLDSTP 160 

RVFYLGGNPE VEFPETQEEQ QERHQQKHSL PVGRRGGQMQ 200 

QEEESEEQKD GNSVLSGFSS EFLAQTFNTE EDTAKRLRSP 240 

RDKRNQIVRV EGGLRIINPE GQQEEEEQEE EEKQRSEQQRN 281 

181 GLEETICSLK IRENIAOPAR AOLYNPRAOS I3TAMSLTLP 
141 ILRYLRLSAE YVRLYRNQIY APHWNINANS LLYVIRQEQR 
101 VRIVNSQGHA VFDNKVRKGQ LVWPQNFW AEQAGEEEQL 

61 EYLVFKTNDR AAV8HVOQVF RAT P AD V LAN AFGLRQRQVT 

21 ELKL8GNRQP LVHPQSOSQS N 

Figure 5. Amino Acid Sequence of the Legumin Propolypeptide. 

The sequence was derived from a genomic DNA sequence (Bäum- 
lein et al., 1986) and is shown without the signal peptide. The 
upper part numbered from 1 to 281 (starting at the amino termi- 
nus) is the sequence of the α chain. The lower part numbered 
from 1 to 181 (starting at the carboxy terminus) is the sequence 
of the β chain. 
Figure 6. Localization of Legumin-Invertase Fusions in Isolated 
Yeast Vacuoles. 

Vacuoles were isolated (see Methods) and the indicated marker 
enzyme activities were determined in the total lysate and in the 
vacuole fraction. 

(A) and (B) Cofractionation of invertase activity of fusions 
Le-Inv-86 (A) and Inv-Le-C76 (B) with the vacuolar marker en- 
zyme α-mannosidase. 



the level of expression of our legumin-invertase fusions 
allowed correct results, we also expressed the fusion 
Le-Inv-86 at a much lower level by transferring the fusion 
including ADH1 promoter and terminator from the multi- 
copy plasmid pAAH5 to the yeast plasmid YCp50 (Sher- 
man et al., 1986). After transformation with this plasmid, 
only one copy of the fusion gene was present per yeast 
cell, resulting in a very low expression. With this low 
activity, variable results were obtained because of back- 
ground interference ranging from 30 to 70% (average 50%) 
secretion (data not shown). This result indicated a certain 
overloading of the vacuolar transport pathway. On the 
other hand, retention values went up to more than 90% 
with long legumin segments. In addition, through the use 
of immunocytochemistry, native legumin expressed with 
pAAH5 could only be detected in yeast vacuoles, and only 
trace amounts of legumin were detected in periplasm and 
medium samples (see above). Together, these results 
indicated that the conclusions derived from the use of the 
yeast expression system are essentially correct. 



Glycosylation of Invertase Provides Additional 
Evidence for Passage through the Secretory System 

Secretory invertase is highly and heterogeneously glyco- 
sylated. This glycosylation occurs in the ER and in the 
Golgi apparatus. Wild-type yeast cells also produce a 
cytoplasmic and, therefore, unglycosylated form of invert- 
ase (Carlson and Botstein, 1982; Perlman et al., 1982). 
These two forms can be distinguished easily by their 
different mobility in a native acrylamide gel (Gabriel and 
Wang, 1969; Carlson et al., 1981). 

All amino-terminal and carboxy-terminal legumin-invertase
fusions (Figures 4A and 4C) produced in yeast were
analyzed for invertase glycosylation. In all cases, the invertase
was highly glycosylated, as illustrated in Figure 7
for some examples. This means that all fusions enter the
secretory system and pass the Golgi apparatus, yielding
additional evidence for the localization of all intracellularly
retained fusions in the vacuole.



Analysis of the Expression of Legumin-CAT Fusions 
in Plants 

Although the legumin propolypeptide (this paper) and PHA
(Tague et al., 1990) are sorted into the vacuoles of yeast,
it remains to be demonstrated that these cells use the
same sorting mechanism and recognize the same targeting
signals as the storage tissue cells in developing plant
seeds. Field bean, the donor of the legumin gene, still
cannot be transformed and regenerated efficiently. Therefore,
we have used the tobacco transformation system to
verify our results in plants. Like many other seed storage
protein genes, the legumin gene used in our study is
correctly expressed in tobacco seeds (Baumlein et al.,
1987). It has also been shown that plant vacuolar proteins
such as phaseolin and PHA from common bean and the
vegetative storage protein patatin from potato are transported
to the vacuoles in transgenic tobacco (Greenwood
and Chrispeels, 1985; Sturm et al., 1988; Sonnewald et
al., 1989).

Legumin-CAT gene fusions (Figure 4B) were trans- 
formed into tobacco using the Ti plasmid system. Trans- 
formation and expression of the gene fusions were verified 
by DNA and RNA blotting techniques. The RNA gel blot 
shown in Figure 8 reveals that mRNA is formed from all



Table 2. Invertase Activity Produced by Carboxy-Terminal
Legumin-Invertase Fusions in Yeast

                    Invertase Activity
Fusion              Total     External    % Secretion
Inv-Le-C13          900       845         94
Inv-Le-C38          5         3.6         72
Inv-Le-C51          500       415         83
Inv-Le-C57          440       250         57
Inv-Le-C63          550       250         46
Inv-Le-C76          190       13          7
Inv-Le-C76-38       580       290         50
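
The "% Secretion" column of Table 2 appears to be the external activity expressed as a percentage of the total activity. The following Python sketch recomputes the column under that assumption; the function and variable names are illustrative and do not come from the paper.

```python
# Total activity, external activity and reported % secretion, as listed in Table 2.
TABLE_2 = {
    "Inv-Le-C13": (900, 845, 94),
    "Inv-Le-C38": (5, 3.6, 72),
    "Inv-Le-C51": (500, 415, 83),
    "Inv-Le-C57": (440, 250, 57),
    "Inv-Le-C63": (550, 250, 46),
    "Inv-Le-C76": (190, 13, 7),
    "Inv-Le-C76-38": (580, 290, 50),
}


def percent_secretion(total, external):
    """Assumed definition: external activity as a percentage of total activity."""
    return 100.0 * external / total


def recompute(table):
    """Return the recomputed % secretion (rounded to an integer) for each fusion."""
    return {
        name: round(percent_secretion(total, external))
        for name, (total, external, _reported) in table.items()
    }


if __name__ == "__main__":
    for name, value in recompute(TABLE_2).items():
        reported = TABLE_2[name][2]
        print(f"{name:15s} recomputed {value:3d}%   reported {reported}%")
```

Under this assumption the recomputed values agree with the reported column to within one percentage point. Inv-Le-C63 gives 45 rather than 46, possibly because the tabulated activities are themselves rounded.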
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of the tetrapeptfcte LORD, see above) because Tague et 
al. (1990) found this tripeptide also in the carboxy-terminal 
region of legumin A from pea, which is homologous to the 
carboxy-terminal part of field bean legumin found to act as 
a vacuolar sorting signal. However, in field bean legumin, 
GLRQR occurs at positions 29 to 25 (numbered from the 
carboxy terminus as in Figure 5) instead of the NLQRN in 
pea legumin. Finally, these sequences are located in the 
carboxy-terminal half of the sorting signal, the deletion of 
which did not lead to complete loss of vacuolar targeting 
but only to a reduction from more than 90% with the fusion 
Inv-Le-C76 to 50% with Inv-Le-C76-38.

In addition, we screened 67 sequences from our data
bank of plant vacuolar proteins, mostly targeted into storage
vacuoles of seeds, for the occurrence of the sequence
LQR. The results are shown in Table 3. Even allowing
conservative amino acid substitutions (isoleucine and valine
for leucine, asparagine for glutamine, lysine for arginine),
only about half of the sequences contained this
putative signal. For example, in 11S globulins, only the
A-type subunits (e.g., legumin A of pea) contained the
tripeptide, even multiple times at different sites, whereas it did
not occur in B-type subunits such as the one used in our
studies.
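
The screen described above can be illustrated with a short Python sketch. It is not the authors' procedure, which the text does not describe. It simply encodes the stated substitution rules (I or V for L, N for Q, K for R) as a regular expression. The function names are hypothetical, and the only test sequences used are the two pentapeptides quoted in the text.

```python
import re

# LQR with the conservative substitutions named in the text:
# I or V for L, N for Q, K for R. The lookahead lets overlapping matches be found.
LQR_LIKE = re.compile(r"(?=([LIV][QN][RK]))")


def find_lqr_like(sequence):
    """Return (1-based position, tripeptide) for every LQR-like match."""
    return [(m.start() + 1, m.group(1)) for m in LQR_LIKE.finditer(sequence.upper())]


def tally(sequences):
    """Count sequences with and without at least one LQR-like tripeptide."""
    with_motif = sum(1 for seq in sequences.values() if find_lqr_like(seq))
    return with_motif, len(sequences) - with_motif


if __name__ == "__main__":
    # Pentapeptides quoted in the text for pea legumin A and field bean legumin.
    examples = {"pea legumin A": "NLQRN", "field bean legumin": "GLRQR"}
    for name, seq in examples.items():
        print(name, seq, find_lqr_like(seq))
    print("with / without:", tally(examples))
```

With these rules, NLQRN from pea legumin A contains the tripeptide and GLRQR from field bean legumin does not, in line with the distinction drawn in the text.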

Taken together, no generalizations can be made about 
vacuolar sorting signals and mechanisms. On the one 
hand, relatively short amino acid stretches with similar 
essential core sequences can act as sorting signals, very 
likely by way of a specific receptor, and the deletion of 
short propeptide domains leads to secretion (Chrispeels, 
1991). On the other hand, we found both in yeast and in 
plant cells that vacuolar targeting of legumin is mediated 
by very different, much more complex sequences that 
might form the signal by higher structures and could act 
by way of a less specific mechanism. 



METHODS 



Reagents 

DNA-modifying enzymes were obtained from Boehringer Mannheim
and from Bethesda Research Laboratories and were used
according to the manufacturers' instructions. DNA sequencing
reagents were from Pharmacia and Boehringer Mannheim. Radioactive
deoxynucleotides and nucleotides were from Amersham.
For protein gel blot staining, Promega's ProtoBlot Western Blot
AP System and Amersham's streptavidin-biotin system were
used. Novozym 234 and Lysing Enzyme (cell wall lysing enzymes
from Trichoderma harzianum) were from Calbiochem and Sigma,
respectively. o-Dianisidine, N-ethylmaleimide, and Ficoll 400 were
obtained from Sigma. NADPH, p-nitrophenyl-α-glucopyranoside,
p-nitrophenyl-α-mannopyranoside, and low melting agarose
were from Serva; horseradish peroxidase, glucose oxidase,
and invertase from Boehringer Mannheim; and cytochrome c from
Biomed (Cracow, Poland). Antibiotics for selection of transformed



Table 3. Occurrence of the Tripeptide LQR in Plant Vacuolar
Proteins

                                     No. of Sequences
Types of Plant Vacuolar Proteins     With LQR    Without LQR    Total
11S globulins                        13          11             24
7S globulins
  Vicilin-like                       8           0              8
  Convicilin-like                    4           0              4
Lectins                              4           3              7
Albumins                             3           2              5
2S proteins                          1           3              4
Prolamins                            0           11             11
Vegetative storage proteins          0           4              4
Total                                33          34             67

bacteria and transformed plants were obtained mainly from Serva,
and 14C-chloramphenicol was from Amersham. Antisera against
CAT were from Hoffmann-LaRoche Inc. (Burns and Crowl, 1987)
and from 5 Prime-3 Prime Inc.; the legumin antibody used for
immunocytochemistry was a gift of R. Manteuffel (Gatersleben).
Affi-Gel was from Bio-Rad and Protein A was from Pharmacia.



Strains and Media 

Escherichia coli strains were used for plasmid maintenance. E.
coli JM101 was used for production of single-stranded DNA with
the helper phage M13K07 (Vieira and Messing, 1988). E. coli
GJ23 was used as helper strain for conjugation (van Haute et al.,
1983). Agrobacterium tumefaciens C58C1 (rifampin) containing
the disarmed Ti plasmid pGV3850 (Zambryski et al., 1983) was
used for transformation of tobacco (Nicotiana tabacum Petit Havana
cv SR1, Horsch et al., 1985). Nutrient broth (Immunpräparate
Berlin) was used for growth of E. coli strains; YEB medium
(Vervliet et al., 1975) was used for growth of Agrobacterium
strains.

Yeast (Saccharomyces cerevisiae) strain SHY2 (ura3, trp1,
leu2, his3, can) (Botstein et al., 1979) was used for expression of
native legumin; strain SEY6210 (ura3, trp1, leu2, his3, lys2, can,
suc2-Δ9) (obtained from S. Emr, Pasadena, CA) was used for
expression of legumin-invertase fusions. For growth of legumin-bearing
SHY2 strains, minimal medium with 2% glucose was used
according to Tanaka et al. (1967). For SEY6210 strains bearing
legumin-invertase fusions, the medium contained 2% fructose as
carbon source. Cells for spheroplast formation and vacuole isolation
were grown in minimal medium according to Wickerham
(1946).



Plasmid Constructions

All basic constructions were performed in the phagemid pBS-
(Stratagene). The intron-free legumin gene was constructed by
substituting a BamHI-KpnI restriction fragment of a legumin cDNA
clone, B273, isolated in our laboratory (G. Saalbach, unpublished
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results) for the corresponding intron-containing fragment of a 
legumin genomic clone, LeB4 (Baumlein et al., 1986). An appropriate
restriction fragment from pSV2CAT (Gorman et al., 1982)
containing the entire coding sequence of the CAT gene was
inserted 3' of the legumin gene and Le-CAT fusions generated by
oligonucleotide-directed deletion mutagenesis. Approximately 470
bp of legumin 3'-untranslated region was fused behind the stop
codon of CAT as a polyadenylation signal.

For transformation into tobacco, fusions including the legumin
promoter were subcloned into the SmaI site of the plant intermediary
vector pMU1 (gift of L. Herrera-Estrella). A SalI site was
generated 6 bp 5' to the ATG translation start of the legumin
gene, and corresponding SalI-HindIII fragments of several fusions
were inserted into the SmaI site of pRT103 (Topfer et al., 1987)
for expression with the CaMV 35S promoter. HindIII fragments of
these constructs were also subcloned into pMU1. An SphI fragment
comprising the coding region plus 148 bp of 5'-flanking and
147 bp of 3'-flanking regions was isolated from the intron-free
legumin gene and inserted into the SphI site of pBS-, resulting in
pLeSph.

Plasmid pSUC23 containing a SUC2 gene (Taussig and Carlson,
1983) was obtained from T. Rapoport (Berlin). A HindIII fragment
of this gene spanning bases 12 to approximately 2700 was
inserted into the HindIII site of pLeSph such that both genes
were in the same orientation. The legumin-invertase gene fusions
were generated by deletion mutagenesis. For the fusion of carboxy-terminal
legumin segments to invertase, an EcoRI-PstI fragment
from pSUC23 spanning bases -26 (plus polylinker) to approximately
2270 of the SUC2 gene was subcloned into pBS- by
way of EcoRI-PstI. The resulting plasmid pSU was opened with
PstI, blunt ended by T4 DNA polymerase, and then cut with SphI
(polylinker). A KpnI(blunt)-SphI fragment spanning bases 1473 to
1850 of the legumin gene with 230 bp of 3'-coding region and
147 bp of 3'-noncoding region was inserted into the opened pSU
plasmid such that the SUC2 gene and the legumin gene fragment
were in the same orientation.

Oligonucleotide-directed in vitro mutagenesis for generation of
restriction sites and deletions was performed essentially according
to Zoller and Smith (1983). Oligonucleotides were synthesized on
an Applied Biosystems 391 DNA synthesizer (F. Machemehl,
Gatersleben). For deletions, oligonucleotides encoding the junction
sequences were 30 to 40 bases long. The fusion sequences
were verified using dideoxy DNA sequencing (Sanger et al., 1977).
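
The design of a junction oligonucleotide for a deletion can be sketched as follows. This is only an illustration, not the authors' design procedure. It assumes equal flanks on either side of the deletion, which the text does not state, and all names are hypothetical. Whether the sense sequence or its reverse complement is ordered depends on which strand is present in the single-stranded template.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_oligo(template, del_start, del_end, flank=17):
    """Sense-strand sequence spanning a deletion junction.

    template  : sense strand, 5' to 3'
    del_start : 0-based index of the first deleted base
    del_end   : 0-based index of the first base retained after the deletion
    flank     : bases kept on each side (15 to 20 gives 30 to 40 bases in total)
    """
    if not (flank <= del_start < del_end <= len(template) - flank):
        raise ValueError("deletion must leave at least `flank` bases on both sides")
    upstream = template[del_start - flank:del_start]
    downstream = template[del_end:del_end + flank]
    return upstream + downstream


if __name__ == "__main__":
    # Toy template: 20 A, then 5 C to be deleted, then 20 G.
    template = "A" * 20 + "CCCCC" + "G" * 20
    oligo = junction_oligo(template, 20, 25, flank=15)
    print(oligo, len(oligo))
    print(reverse_complement(oligo))
```

Annealing such an oligonucleotide to the template loops out the intervening sequence, so that extension and ligation yield the fused junction.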

For expression of legumin and legumin-invertase fusions in
yeast, SalI-HindIII (polylinker) restriction fragments starting 6 bp
in front of the ATG (see above) were isolated from the pBS-
plasmids and inserted by way of blunt-end ligation into the HindIII
site of the yeast shuttle plasmid pAAH5 (Ammerer, 1983). For the
carboxy-terminal invertase-legumin fusions, EcoRI-SphI fragments
were used in the same way. For low-level expression in
yeast, a BamHI fragment was isolated from the pAAH5 clone
containing Le-Inv-86. This fragment comprising the ADH1 promoter,
the legumin-invertase gene fusion, and the ADH1 terminator
was inserted into the BamHI site of YCp50 (Sherman et al., 1986;
obtained from S. Emr, Pasadena, CA).



Transformation of Yeast and Tobacco 

Yeast was transformed according to Ito et al. (1983), and transformed
cells were selected on minimal medium with the required
supplements.



Tobacco leaves were used for Agrobacterium-mediated leaf
disc infection as described by Horsch et al. (1985). Transformants
were selected on 100 µg/mL kanamycin. Transformation of the
genes was verified by DNA gel blotting (Southern, 1975), and
expression of the genes by RNA gel blotting using the method of
Thomas (1983).



Immunological Procedures and Electron Microscopy 

Antibodies against a B-type subunit of legumin (gift of C. Horstmann,
Gatersleben) and against α and β chains isolated from a
denaturing reducing polyacrylamide gel were raised in mice. The
antigens were bound to Affi-Gel and used for affinity purification
of the antibodies.

Yeast cells were homogenized with dry ice and glass beads in
a mortar, and the frozen powder was extracted by boiling in SDS
sample buffer. Tobacco seeds were ground in a mortar under
liquid nitrogen, and the powder was extracted in 0.1 M phosphate
buffer, pH 7.5, with 1 M KCl. Spheroplasts and isolated vacuoles
from yeast as well as cell fractions from tobacco seeds were lysed
by addition of concentrated SDS sample buffer and boiling. Samples
were separated on polyacrylamide gels (Laemmli, 1970) and
blotted to nitrocellulose according to Towbin et al. (1979). Legumin
bands were visualized by treating the blots with legumin
antibodies, followed by biotinylated anti-mouse immunoglobulin G
and streptavidin-alkaline phosphatase or by alkaline phosphatase
immunoglobulin G conjugates, and staining with nitro blue tetrazolium
and 5-bromo-4-chloro-3-indolyl phosphate.

For electron microscopy, yeast cells were prefixed with 2%
glutaraldehyde in cacodylate buffer, pH 7.2, with 1 M sorbose.
After centrifugation, the cells were wrapped in 3% low melting
agarose (gel point 26 to 29°C). Pieces of 1 mm³ were quickly
frozen in liquid propane (-185°C) and stored under liquid nitrogen.
Freeze-substitution was carried out according to Müller et al.
(1980). At -35°C, the samples were embedded in Lowicryl K4M
resin, which yields a good preservation of cellular structures and
low nonspecific antibody binding (Roth et al., 1981; Craig and
Goodchild, 1982). Embedding was carried out in gelatin capsules
by UV light polymerization under nitrogen for 24 hr. After 2 days
of curing, thin sections were cut using glass knives on an LKB-Ultrotome.
Sections were collected on Formvar film-coated nickel
grids. Nonspecific binding was blocked by treatment with 0.1%
BSA in 0.15 M phosphate buffer, pH 7.2, containing 0.5% Tween
20, 0.1% PEG, and 5 mM ammonium chloride for 10 min. The
grids were incubated on drops of diluted legumin antibody solution
for 15 min and then extensively washed in blocking solution with
reduced BSA content (100 µg/mL). Thereafter, the grids were
treated with protein A-gold conjugate prepared according to Roth
(1983) for 15 min and washed again. Finally, grids were washed
in distilled water and contrasted with uranyl acetate (1% in
ethanol). After washing with pure ethanol and drying, specimens
were evaluated in a TESLA BS500 transmission electron microscope
at 60 kV. Controls were run without antibody treatment
and with preimmune serum.



Assays 

Quantitative invertase assays were performed using the method 
of Goldstein and Lampen (1975), as described by Johnson et al. 
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(1987). External activity was measured using Intact ceOs, total 
activity after lysis of the cells with 0.5% Triton X-100. Invertase
activity gels were performed using triphenyltetrazolium chloride
according to Gabriel and Wang (1969). CAT activity was assayed
according to Gorman et al. (1982).

Spheroplasts of yeast were prepared as described by Maraz
and Subik (1981) using lysing enzymes at 3 to 5 mg/mL. Spheroplasts
were lysed and vacuoles isolated on a Ficoll step gradient
according to Stevens et al. (1982). The vacuole fraction and the
original spheroplast lysate were assayed for invertase and marker
enzyme activities. α-Mannosidase was assayed according to
Opheim (1978), NADPH cytochrome c reductase was assayed as
described by Kubota et al. (1977), and α-glucosidase was assayed
according to Halvorson and Ellias (1958).
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Uwe Sonnewald 1 *, Monika Brauer 2 , Antje von 
Schaewen 1 †, Mark Stitt 2 and Lothar Willmitzer 1

1 Institut für Genbiologische Forschung Berlin GmbH,
Ihnestrasse 63, 1000 Berlin 33 and
2 Lehrstuhl für Pflanzenphysiologie, Universität Bayreuth,
Universitätsstrasse 30, 8580 Bayreuth, Germany

Summary 

In higher plants sucrose plays a central role with
respect to both short-term storage and distribution of 
photoassimilates formed in the leaf. Sucrose is synthe- 
sized in the cytosol, transiently stored in the vacuole 
and exported via the apoplast. In order to elucidate the 
role of the different compartments with respect to 
sucrose metabolism, a yeast-derived invertase was 
directed into the cytosol and vacuole of transgenic 
tobacco plants. This was in addition to the targeting of 
yeast-derived invertase into the apoplast described 
previously. Vacuolar targeting was achieved by fusing
an N-terminal portion (146 amino acids long) of the 
vacuolar protein patatin to the coding region of the 
mature invertase protein. 

Transgenic tobacco plants expressing the yeast- 
derived invertase in different subcellular compartments 
displayed dramatic phenotypic differences when 
compared to wild-type plants. All transgenic plants 
showed stunted growth accompanied by reduced root 
formation. Starch and soluble sugars accumulated in 
leaves indicating that the distribution of sucrose was 
impaired in all cases. Expression of cytosolic yeast 
invertase resulted in the accumulation of starch and 
soluble sugars in both very young (sink) and older 
(source) leaves. The leaves were curved, indicating a 
more rapid cell expansion or cell division at the upper 
side of the leaf. Light-green sectors with reduced 
photosynthetic activity were evenly distributed over 
the leaf surface. With the apoplastic and vacuolar 
invertase, the phenotypical changes induced only 
appear in older (source) leaves. The development of 
bleached and/or necrotic sectors was linked to the 
source state of a leaf. Bleaching followed the sink to 
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source transition, starting at the rim of the leaf and 
moving to the base. The bleaching was paralleled by 
the inhibition of photosynthesis. 

Introduction 

Photosynthesis represents the major source of energy
used to support biological processes in all living organisms.
As primary products of photosynthesis, carbohydrates are 
formed in photosynthetically active cells and organisms. 
In higher plants, leaves and to a certain extent also other 
parts of the plant, e.g. stem tissue, represent the primary 
sites for photosynthesis. In contrast, other parts of the 
plant, e.g. roots, seeds or tubers, do not contribute signifi- 
cantly to the whole energy gain reached via photosynthesis 
but rather are largely dependent on carbon dioxide fixed in 
other photosynthetically active parts of the plant. Thus 
there is a net flow of energy from photosynthetically active 
tissues, representing the sources (defined as net exporters 
of fixed carbon), to photosynthetically inactive tissues of 
the plant, representing the sinks (defined as net importers 
of fixed carbon). 

The primary products of carbon fixation are starch and 
sucrose. Whereas starch in leaves mainly serves as an 
intermediate deposit for the products of carbon fixation, 
sucrose, in addition to representing a storage form for the 
products of photosynthesis, plays a central role in the 
distribution of photoassimilates throughout the plant, 
especially the supply of photoassimilates to sinks. 

Due to its dual role, sucrose is present in several different 
compartments. Sucrose is synthesized in the cytosol and 
is transiently stored in the vacuole. Finally, sucrose is 
probably exported from the leaf by transferring it into 
the apoplast followed by active uptake into the phloem 
(Turgeon, 1989). Although the involvement of the different
compartments with respect to sucrose metabolism and/ 
or distribution has been known for many years, little is 
known about the role of the different compartments in 
relation to sucrose metabolism. 

Here we describe a new means of approaching this 
problem. This method takes advantage of our ability to 
create transgenic plants containing foreign genes, thus 
allowing the modulation of endogenous genes and/or 
ectopic expression of alien gene products. These plants 
can be disturbed at a particular step in the biosynthesis, 
and the storage and/or distribution of certain metabolites 
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can then be analyzed, by various biochemical and 
physiological means and compared to near-isogenic wild- 
type plants. This makes it possible to follow the influence 
of this disturbance at the molecular level. 

In order to introduce such disturbances into the metabo- 
lism of sucrose, we decided to express a foreign (yeast- 
derived) invertase in the cytosol and vacuole of transgenic 
plants in addition to its expression in the apoplast described 
previously (von Schaewen et al., 1990). This invertase
should give rise to cleavage of sucrose yielding glucose 
and fructose and thus should interfere with the normal 
biosynthesis, storage and distribution of sucrose. The 
yeast invertase was chosen for two reasons: first, this 
enzyme should not be inhibited by endogenous invertase
inhibitors present in plants, and secondly, this enzyme is 
known to be active over a broad pH range (Goldstein and 
Lampen, 1976), which is important in relation to the different 
pH values of various subcellular compartments. 

Whereas secretion of foreign proteins has already been
described in transgenic plants (Dorel et al., 1989; von
Schaewen et al., 1990), no data have as yet been reported
on the targeting of chimeric non-plant-derived proteins
into the vacuole of transgenic plants. Secreted or vacuolar
proteins are synthesized as pre-proteins with a hydrophobic
N-terminal signal peptide. In plants it has been shown that
the signal peptide is required for secretion of chimeric
proteins. Further information is needed for the targeting of
foreign proteins into the vacuole (Dorel et al., 1989). In
the case of β-1,3-glucanases it has been speculated that
vacuolar and secreted forms differ by C-terminal extensions
being present in the cDNAs of the vacuolar but not the
secreted proteins (Van den Bulcke et al., 1989). Furthermore,
removal of the C-terminal propeptide of the barley lectin
gene led to the secretion of the truncated protein in
transgenic tobacco plants, indirectly demonstrating the
vacuolar targeting function of the C-terminal propeptide
(Bednarek et al., 1990). In yeast, however, it was shown
that certain N-terminal regions of either plant-derived or
yeast-derived vacuolar proteins are sufficient for achieving
vacuolar targeting of chimeric proteins (Tague et al., 1990;
Valls et al., 1987). Here we describe how the fusion of a
large N-terminal-derived portion of the vacuolar protein
patatin to the mature invertase protein from yeast results
in vacuolar targeting of the invertase. To the best of our
knowledge, this represents the first example of directing
a foreign protein into the vacuoles of a transgenic plant.

The transgenic plants expressing the yeast-derived 
invertase either in the cytosol, the vacuole or the apoplast 
show a clear phenotype which varies with the compartment 
in which the invertase is expressed and also depends on
the developmental stage of the plant. Despite the fact that
plants expressing the invertase in the different compartments
show distinct differences, there are certain similarities
in both the visual and the biochemical phenotype. These



experiments therefore prove the central importance of the 
compartmentation of sucrose with respect to its bio- 
synthesis, storage and distribution. Furthermore, these 
plants demonstrate the power of reversed genetics for 
analyzing and understanding a variety of physiological 
problems. 



Results 

Construction of chimeric genes directing yeast invertase 
into the cytosol or the vacuole 

Eukaryotic cells develop specialized compartments to
separate different anabolic and metabolic functions. To
direct newly synthesized proteins to their proper destination, 
cellular receptor proteins and specific targeting signals of 
newly synthesized proteins interact. To manipulate bio- 
chemical pathways in any given organelle the isolation of 
suitable enzymes as well as targeting signals is required. 

Proteins are synthesized in the cytosol and no further 
information is needed if they are to remain there. Cytosolic 
accumulation of yeast invertase was achieved via a fusion 
between the 5'-untranslated leader of the proteinase
inhibitor II (PI-II) gene from potato and the yeast suc2
coding region (subsequently called Cy-INV; Figure 1).

Secreted or vacuolar proteins are synthesized as pre- 
proteins with a hydrophobic N-terminal signal peptide. As 
described above, in the case of yeast it has been demon- 
strated that targeting signals needed to achieve vacuolar 
localization of foreign proteins are present in the N-terminal 
portion of either plant- or yeast-derived vacuolar proteins. 
We therefore decided to test whether or not this would 
also hold true for chimeric genes in plants. Therefore, a 
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Figure 1. Structure of the cytosolic (Cy-iNV) and the vacuolar (V-iNV) 
invertase gene. * ' 
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DNA fragment of the patatin genomic clone pgT5 (Rosahl 
et al., 1986), containing parts of the 5'-untranslated leader,
the 23 amino acid signal peptide and also the 123 N-terminal
amino acids from the mature patatin protein, was fused
with the suc2 gene from yeast (Figure 1) in order to
achieve vacuolar targeting of the chimeric protein (subsequently
called V-INV).

Chimeric yeast invertase is active in all three 
compartments 

The chimeric yeast invertase genes were transformed into
tobacco plants using Agrobacterium-mediated gene
transfer. Transgenic tobacco plants arising after kanamycin
selection were analyzed for the presence of intact non-rearranged
genes by DNA gel blotting (data not shown).
Only plants containing intact copies of the chimeric genes
were used for further analysis. In each case, about 50
independent transformants were tested for invertase activity
using the invertase gel assay system. Five plants of each
of the individual invertase constructs displaying different
amounts of the alien invertase activity were transferred to
the greenhouse for further analysis.

In order to determine whether the different invertase 
constructs were active after targeting them to different 
compartments, total protein was isolated from leaves of 
transgenic tobacco plants and separated on SDS-PAGE. 

As shown in Figure 2, invertase activity could be detected 
under semi-native conditions in SDS-PAGE in all three 
compartments. It is interesting to note that the cytosolic 
invertase migrates with much higher mobility than the 
secreted and vacuolar forms. This difference cannot be 
explained only by the fact that the cytosolic form is not 
glycosylated whereas the two other forms are highly glyco- 
sylated (data not shown), but probably also results from 
the fact that only the glycosylated form produces oligomeric 
aggregates, whereas the cytosolic form is present as a 
monomer, as deduced from its migration (cf. Esmon et
al., 1987). During leaf development an increase in invertase
activity is found for the vacuolar and the secreted
invertase, whereas the activity of the cytosolic form decreases
(Table 1). A further decrease in the activity of the
cytosolic invertase occurs during plant ageing.

A fusion protein between patatin and yeast invertase is 
efficiently targeted to the vacuole of transgenic tobacco 

The correct targeting of the patatin-invertase fusion protein
into plant vacuoles was verified using three different experimental
techniques.

First, transient expression experiments were carried out
in which the V-INV gene was introduced into Arabidopsis
mesophyll protoplasts. After a cultivation period of 2-3
days, protoplasts were separated from the cultivation
medium and both fractions were assayed for
invertase activity (Figure 3). The patatin-invertase fusion
protein was only detectable in protoplasts (Figure 3, lane 
5) and was not secreted into the culture medium (Figure 3, 
lane 6). When constructs expressing the secreted chimeric 
yeast invertase were introduced into protoplasts as a 
control, invertase activity in both the protoplasts and 
culture medium increased. Invertase activity only increased 
in the protoplasts when constructs expressing the cytosolic 
invertase were introduced into protoplasts as controls. 
Untransformed control protoplasts did not show any 
invertase activity under the experimental conditions used 
(Figure 3, lanes 7 and 8). 

The transient expression data alone showed that the
V-INV-encoded protein was not secreted but stayed
within the protoplasts. The V-INV-encoded protein was
completely retained on a concanavalin A (ConA) Sepharose
column, whereas the Cy-INV-encoded protein did not
bind to ConA. This suggests a glycosylation of the V-INV-encoded
protein, which therefore makes a cytosolic location
unlikely, but a location within the endoplasmic reticulum-Golgi
pathway has to be assumed.

Cell fractionation via differential centrifugation of 
Arabidopsis protoplasts after transient expression demon- 
strated that the patatin-invertase fusion protein was not 
precipitated with the endoplasmic reticulum-Golgi vesicles 



Table 1. Effect of leaf age on invertase activity

                        Invertase activity (µmol sucrose m⁻² sec⁻¹)
Leaves     Sector        Wild-type      Cy-INV          Cw-INV           V-INV
Young a    Green         0.54 ± 0.04    25.79 ± 2.66    11.02 ± 0.44     3.55 ± 0.62
Middle b   Green         0.20 ± 0.01    5.09 ± 0.61     26.50 ± 8.47     4.90 ± 1.51
           Light green                  6.67 ± 1.36     39.44 ± 13.26    7.00 ± 0.55
Old c      Green         0.16 ± 0.05    1.71 ± 0.16     42.99 ± 7.30     5.22 ± 1.72
           Light green                  1.38 ± 0.12     56.67 ± 3.52     6.00 ± 0.56

a Leaf length less than 10 cm.
b Leaf length between 10 and 15 cm.
c Leaf length between 20 and 25 cm.
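
The trends in Table 1 can be summarized with a small Python sketch. It uses only the mean activities of the green sectors and ignores the reported errors, so it is a rough reading of the table rather than a statistical analysis. The function names are illustrative.

```python
# Mean invertase activity of green leaf sectors, taken from Table 1
# (µmol sucrose per square metre per second).
ACTIVITY = {
    "Wild-type": {"young": 0.54, "middle": 0.20, "old": 0.16},
    "Cy-INV": {"young": 25.79, "middle": 5.09, "old": 1.71},
    "Cw-INV": {"young": 11.02, "middle": 26.50, "old": 42.99},
    "V-INV": {"young": 3.55, "middle": 4.90, "old": 5.22},
}
STAGES = ("young", "middle", "old")


def fold_change(line, start="young", end="old"):
    """Ratio of mean activity at `end` to mean activity at `start`."""
    values = ACTIVITY[line]
    return values[end] / values[start]


def trend(line):
    """Classify the change in mean activity across the three leaf stages."""
    values = [ACTIVITY[line][stage] for stage in STAGES]
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "increasing"
    if all(a > b for a, b in pairs):
        return "decreasing"
    return "mixed"


if __name__ == "__main__":
    for line in ACTIVITY:
        print(f"{line:10s} {trend(line):11s} old/young = {fold_change(line):.2f}")
```

On the means alone, the cytosolic activity falls with leaf age while the cell wall and vacuolar activities rise. The rise for V-INV is small relative to the reported errors, so it should be read with caution.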
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Figure 2. SDS-PAGE separation showing that chimeric yeast invertase is 
active in the cytosol, vacuole and apoplast of transgenic tobacco.
Protein extracts (50 µg total soluble protein each lane) were separated by
SDS-PAGE (omitting the boiling step before electrophoretic separation,
leaving active and intact invertase aggregates) and assayed for invertase
activity (see Experimental Procedures). No invertase activity is detectable
in nontransformed tobacco plants (control). In transformed tobacco plants
a new invertase activity is detectable for the vacuolar (V-INV), the
cytosolic (Cy-INV) and the cell wall (Cw-INV) transformants.

(Figure 4, lane P3) but was present in the soluble fraction 
(Figure 4, lane Sol). 

As a final proof for the localization of the V-INV-encoded
protein, vacuoles from transgenic tobacco mesophyll
protoplasts were isolated. After their isolation, the activity
of the vacuolar marker enzyme α-mannosidase was determined
for intact protoplasts and isolated vacuoles. The
same amount of α-mannosidase activity for both protoplasts
and vacuoles was loaded on SDS-PAGE and the
invertase activity determined. As seen in Figure 5, invertase
activity was detectable in isolated vacuoles to the same
extent as in intact protoplasts. These experiments clearly
demonstrate that the chimeric yeast invertase segregates
with the vacuolar marker enzyme α-mannosidase, thus
confirming its vacuolar localization.

Expression of chimeric yeast invertase in the cytosol, the 
vacuole and the apoplast leads to the accumulation of 
carbohydrates in leaves of transgenic tobacco 

Accumulation of yeast invertase in different compartments 
of the cell is expected to interfere with the synthesis, 
transport and/or storage of sucrose. Consequently, changes 
in the carbohydrate pools of transgenic plants were investi- 
gated. In all cases starch accumulated in leaves and was 
not degraded during the dark period (Table 2), indicating 
that the leaf was not able to transport the fixed carbohydrate 
to other parts of the plant. Surprisingly, sucrose increased 
in all the transformants. This increase has been consistently 




Figure 3. Transient expression of the different chimeric invertase genes 
in Arabidopsis thaliana protoplasts. 

Mesophyll protoplasts isolated from axenically grown Arabidopsis plants 
were transformed with plasmid DNA of the different invertase constructs. 
After 72 h protoplasts were pelleted and the proteins present in the 
supernatant (lanes 2, 4, 6 and 8) and the pellet (lanes 1, 3, 5 and 7) were 
separated by SDS-PAGE and subsequently assayed for invertase activity. 
Lanes 1 and 2, cell wall invertase (it is important to note that in other 
experiments less invertase was detectable in the protoplasts, which might 
be explained by a different capability of the protoplasts to secrete proteins). 
Lanes 3 and 4, cytosolic invertase. 
Lanes 5 and 6, vacuolar invertase. 
Lanes 7 and 8, untransformed control protoplasts. 




Figure 4. Cell fractionation of Arabidopsis protoplasts after transient 
expression of the V-INV gene. 

Following transient expression cells were lysed and subjected to a series 
of centrifugation steps. After removal of cell debris (700 g for 5 min) the 
supernatant was centrifuged at 10 000 g for 10 min, yielding pellet P2 
(containing large organelles). The resulting supernatant was subjected to 
a high-speed centrifugation (80 000 g for 2 h) to pellet membranes and 
vesicles (pellet P3). The membrane-free supernatant (Sol) contains both 
the cytoplasmic and the vacuolar cell fraction. All fractions were brought to 
the same volume and 50 µl of each fraction were subjected to SDS-PAGE. 
The gel was developed for invertase activity. 

observed for the cell wall invertase in three different groups 
of tobacco plants (see, for example, von Schaewen et al., 
1990; Stitt et al., 1990), as well as in Arabidopsis trans- 
formants (von Schaewen et al., 1990). In other experiments, 
we have seen that sucrose sometimes increases (see 
Figure 5. Detection of chimeric invertase in the vacuoles of transgenic 
tobacco. 

Vacuoles were isolated from mesophyll protoplasts of transgenic tobacco 
plants carrying the V-INV gene and assayed for invertase activity after gel 
electrophoresis. The invertase activity in the vacuole was compared to the 
invertase activity present in intact protoplasts after loading the same 
amount of α-mannosidase activity (equivalent to 250 ng of 'jack bean' 
α-mannosidase; see Experimental Procedures) onto each lane. 
Untransformed control: lane 1, protoplasts; lane 2, isolated vacuoles. 
Transgenic tobacco expressing the V-INV gene: lane 3, protoplasts; lane 
4, isolated vacuoles. 

Table 2), but may remain unaltered or, in the case of the 
transformants expressing the vacuolar or cytosolic form, 
decrease slightly (M. Brauer, unpublished results). The 
experiments reported in Table 2 were carried out with 
plants growing in a greenhouse during the summer, whereas 
experiments in which sucrose did not accumulate were 
carried out with plants from growth chambers kept at 
lower irradiances. 

Differences were found in the relative amounts of glucose 
and fructose (Table 2). The cleavage of sucrose via 
invertase action would lead to glucose and fructose in a 
ratio of 1:1, which is indeed found in the case of cell wall 
invertase. Cleavage of sucrose within the cytosol leads to 
a 5-10-fold larger accumulation of fructose than glucose. 
This excess of fructose (sometimes being as large as 40- 
fold) has consistently been observed in experiments on 



three separate groups of plants. In the case of the vacuolar 
transformant, there was a 2-4-fold excess of glucose over 
fructose in these plants (Table 2) but this difference was 
not present in other experiments carried out with plants 
from a growth chamber (data not shown). 
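The hexose ratios discussed above can be checked directly against the values in Table 2. The following is a minimal sketch, not part of the original analysis: the dictionary layout and the function name `fructose_to_glucose` are illustrative assumptions, and the numbers are the glucose and fructose contents as read from Table 2.

```python
# Glucose and fructose contents as read from Table 2 (mmol hexose per m^2).
# Each entry is (glucose, fructose); "dark" = after 16 h darkness,
# "light" = after 8 h illumination.
table2 = {
    "Wild-type": {"dark": (0.18, 0.12), "light": (0.22, 0.16)},
    "Cy-INV": {"dark": (0.47, 6.7), "light": (1.1, 4.72)},
    "Cw-INV": {"dark": (9.8, 8.1), "light": (10.9, 9.8)},
    "V-INV": {"dark": (14.6, 5.85), "light": (19.8, 4.03)},
}


def fructose_to_glucose(glucose, fructose):
    """Return the fructose:glucose ratio (hypothetical helper for illustration)."""
    return fructose / glucose


for genotype, conditions in table2.items():
    for condition, (glucose, fructose) in conditions.items():
        ratio = fructose_to_glucose(glucose, fructose)
        print(f"{genotype:9s} {condition:5s} fructose:glucose = {ratio:.2f}")
```

A ratio close to 1 corresponds to the equimolar release expected from sucrose hydrolysis, a ratio well above 1 to an excess of fructose, and a ratio well below 1 to an excess of glucose.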



Accumulation of carbohydrates leads to drastic changes 
in the development and habit of transgenic tobacco plants 

During tissue culture, transgenic tobacco plants expressing 
yeast invertase in the different compartments were mainly 
characterized by reduced growth compared to untrans- 
formed control plants. This mild phenotype changed 
dramatically after transfer of the transgenic plants to soil 
and their subsequent growth in the greenhouse. Although 
the expression of the different invertase constructs resulted 
in distinct phenotypes, they had some features in common. 
As a consequence of the invertase activity the plants 
showed a reduced height compared to control plants 
(Figure 6a). The reduced height was accompanied by 
reduced leaf expansion (data not shown) and root develop- 
ment (Figure 6e). The reduced growth of the vegetative 
organs led to late flowering and reduced seed setting (data 
not shown). Thus decreasing the amount of sucrose 
available for distribution prolonged the life of the transgenic 
plants for several months. 

In addition to the phenotypic changes mentioned above, 
transformants harboring the cytosolic yeast invertase 
were characterized by curled leaves (Figures 6b and 7b). 
The curling started early in development and was seen on 
very young leaves. Evenly distributed light-green sectors 
also developed on the leaf surface and these bleached 
completely during leaf ageing. 

Symptom development in plants carrying the cell wall 
and the vacuolar yeast invertase depends on the develop- 
mental stage of the leaf. Very young leaves showed no 
phenotypical changes. In the case of the apoplastic yeast 
invertase, bleached and/or necrotic sectors developed on 
mature source leaves following different patterns (described 
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Table 2. Carbohydrate turnover in invertase-expressing transgenic tobacco plants 

Carbohydrate turnover (mmol hexose m⁻²) 

                        Dark a                                  Light b 
Genotype     Glucose  Fructose  Sucrose  Starch     Glucose  Fructose  Sucrose  Starch 

Wild-type     0.18     0.12      0.21     0.44       0.22     0.16      0.77     9.19 
Cy-INV        0.47     6.7       0.44     7.0        1.1      4.72      1.4     16.4 
Cw-INV        9.8      8.1       4.5      8.3       10.9      9.8      11.7     10.3 
V-INV        14.6      5.85      1.9     16.4       19.8      4.03      4.23    19.5 

a Measured after 16 h darkness. 
b Measured after 8 h illumination. 
Figure 6. Influence of chimeric yeast invertase on development and habit of transgenic tobacco. 
(a) [...] the cell wall [...] 2) and the vacuole (V-INV, 4). 

(b-d) [...] of transgenic tobacco is shown for each mutant: Cy-INV (b), Cw-INV (c) and V-INV (d). 

(e) Impaired root formation of invertase-expressing tobacco plants: (1) wild-type; (2) Cy-INV; (3) Cw-INV; (4) V-INV. 



in von Schaewen et al., 1990). In transgenic plants express- 
ing the vacuolar invertase, bleached [...] 
sink-source transition zone. Symptoms first became 
visible at the rim of the leaf and moved towards the base 
during leaf expansion (Figures 6c and 7a). It is important 
to note that, although the phenotypes of the cell wall and 
the vacuolar yeast invertase are quite similar, much less 
vacuolar invertase is needed to reduce the viability 
of transgenic tobacco plants. 

The specific phenotype described for each of the different 
invertase constructs was consistently observed for all 



plants tested in the greenhouse. This excludes somaclonal 
variation as the cause for the changed habit. The pheno- 
type was transmitted to the progeny [...] activity. Furthermore, 
[...] foreign invertase. Plants of the F1 progeny of a 
V-INV transformant which showed only a mild phenotype 
in the parental generation are shown in Figure 8. A com- 
parison of the severity of the phenotype with the invertase 
activity suggests that the invertase activity must be above 
a certain threshold value (≈ 4.5 µmol sucrose m⁻² sec⁻¹) 
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Figure 7. Development of symptoms is associated with the 
developmental stage of a leaf. 

(a) Clockwise arrangement of leaves taken from a transgenic tobacco 
plant expressing cytosolic yeast invertase: a-d, young to old leaves. 

(b) Clockwise arrangement of leaves taken from a vacuolar-invertase 
expressing transgenic tobacco plant: a-d, young to old leaves. 

(cf. also Figure 6d and Table 1) to influence development 
of the plant. 



Photosynthesis is reduced in all transgenic plants 
expressing chimeric invertase proteins 

As described above, expression of yeast invertase in 
transgenic tobacco plants leads to bleached areas, due to 
the loss of chlorophyll (data not shown). The percentage 
of bleached regions increased during leaf maturation (Table 
3). In order to determine the rate of photosynthesis in 
relation to symptom development, leaves were separated 
into green, light-green and pale areas. As shown in Table 
4, the rate of photosynthesis follows the development of 
symptoms. In young leaves (< 10 cm length) there was no 



Figure 8. The development of symptoms follows the expression of the 
V-INV gene in the F1 progeny of transgenic tobacco plants. 
The progeny of transformant V-INV-65 was tested for soluble, neutral 
invertase activity and the development of the phenotype was followed in 
the greenhouse. 
The activities were: plant 1, 0.1 µmol sucrose m⁻² sec⁻¹; plant 2, 5.5 µmol 
sucrose m⁻² sec⁻¹; plant 3, 4.5 µmol sucrose m⁻² sec⁻¹; plant 4, 
2.8 µmol sucrose m⁻² sec⁻¹. 

visual phenotype in plants expressing the cell wall invertase 
or vacuolar invertase and photosynthesis was even slightly 
higher than in the wild-type, whereas plants expressing 
the cytosolic invertase already showed an inhibition of 
photosynthesis even in those sectors of the smaller leaves 
which were visually unaffected (Table 4). As the leaves 
matured, the appearance of a visual phenotype in the cell wall- 
or vacuolar-invertase expressing plants was accompanied 
by the inhibition of photosynthesis. This inhibition was 
restricted to the visually affected areas of the larger leaves 
(Table 4). The green areas had a similar or even higher rate 
of photosynthesis than the wild-type (see also von 
Schaewen et al., 1990; Stitt et al., 1990). The inhibition of 
photosynthesis in cytosolic invertase plants (relative to 
wild-type leaves of a comparable size) did not increase in 
the older leaves. 
Table 3. Development of symptoms in invertase-expressing 
transgenic tobacco plants a 

Percentage of total leaf area for each phenotype 

Genotype    Leaf length (cm)    Green    Light-green    Pale 




* . ' • 

1Q-15 


22-5 
35.0 


■ Ttis - • - 

65.Q 




1^20 
ifelS 


32:6 

89# 
9&8 . 


57.7 $J 

107 : t 




15-20 
10*15 


4*0 






<10 


874} 





scfee^ng,^ 



The overall rate of photosynthesis of a leaf will depend 
on (i) the percentage of the area which has developed a 
phenotype (Table 3), (ii) the inhibition of photosynthesis 
(Table 4) and (iii) the size of the leaf (Table 4). As only part 
of the leaf is affected and the green areas retain high or 
enhanced rates of photosynthesis, the average rate per 
unit area remains relatively high (Table 5). However, the 
total yield of photosynthesis will be decreased due to the 
fact that the leaves of invertase transformants are smaller. 
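The dependence of the overall rate on sector area and sector rate can be illustrated as an area-weighted mean. This is a minimal sketch under stated assumptions: the function name `overall_rate` and the area fractions are hypothetical and chosen only for illustration, while the sector rates and the wild-type rate are values as read from Table 4 for leaves of 20-25 cm.

```python
def overall_rate(area_fractions, sector_rates):
    """Area-weighted mean rate of photosynthesis per unit leaf area.

    area_fractions: fraction of the leaf area in each sector (must sum to 1).
    sector_rates:   rate of photosynthesis measured in each sector.
    """
    total = sum(area_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("area fractions must sum to 1")
    return sum(area_fractions[s] * sector_rates[s] for s in area_fractions)


# Sector rates for Cw-INV leaves of 20-25 cm, as read from Table 4
# (umol O2 per m^2 per sec).
cw_inv_rates = {"green": 30.18, "light_green": 18.34, "pale": 4.21}

# Hypothetical area fractions (assumption, NOT taken from Table 3).
example_fractions = {"green": 0.5, "light_green": 0.3, "pale": 0.2}

# Wild-type rate for leaves of 20-25 cm, as read from Table 4.
wild_type_rate = 24.79

rate = overall_rate(example_fractions, cw_inv_rates)
percent_of_wild_type = 100.0 * rate / wild_type_rate
print(f"overall rate: {rate:.2f} umol O2 m^-2 sec^-1")
print(f"relative to wild-type: {percent_of_wild_type:.1f}%")
```

Because the green sectors keep a high rate, the weighted mean stays fairly close to the wild-type value even when a substantial part of the leaf is pale, which is the point made in the text.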



Table 5. Overall rate of photosynthesis a 

                      Rate of photosynthesis 
                      (percentage of wild-type) 
Leaf length (cm)    Cy-INV    Cw-INV    V-INV 

20-25                 -        83.9      79.1 
15-20                 -       112.0      93.4 
10-15                49.4      99.9      85.1 
< 10                 57.3     106.0     115.0 

a From the rate of photosynthesis in the different 
sectors of invertase-expressing leaves, the overall rate of photo- 
synthesis per unit area was calculated after counting the portion 
of each individual sector. 



Targeting of chimeric yeast invertase into the vacuole 

In order to apply the tools of molecular genetics to physio- 
logical problems, efficient ways of targeting foreign proteins 
into different subcellular compartments are needed because 
many biochemical processes are highly compartmentalized. 
Protein synthesis takes place in the cytosol and from here 
the proteins have to be targeted to their proper destination. 
The transport of proteins into mitochondria or chloroplasts 
is mediated by the N-terminal transit peptide (Chua and 
Schmidt, 1979). The first step in the secretion of proteins 
is the uptake of the precursor protein into the lumen of the 
endoplasmic reticulum (Blobel, 1980). In the absence of 



Table 4. Rate of photosynthesis in different transgenic tobacco plants a 

                                Steady-state rate of photosynthesis for each 
                                phenotype (µmol O₂ m⁻² sec⁻¹) 
Genotype    Leaf length (cm)    Green            Light-green      Pale 

Wild-type   25-30               25.25 ± 0.83 
            20-25               24.79 ± 1.95 
            15-20               24.94 ± 3.25 
            10-15               27.47 ± 3.63 
            < 10                17.78 ± 3.70 

Cy-INV      15-20               30.75 ± 3.85     11.53 ± 1.27 
            10-15               19.20 ± 6.56     10.56 ± 2.27 
            < 10                14.52 ± 1.35      6.50 ± 2.10 

Cw-INV      20-25               30.18 ± 2.88     18.34 ± 0.85     4.21 ± 2.42 
            15-20               30.03 ± 1.32     17.57 ± 1.15 
            10-15               27.50 ± 2.87     27.00 ± 0.89 
            < 10                18.85 ± 1.65 

V-INV       15-20               24.57 ± 3.15     18.04 ± 2.91     3.98 ± 1.89 
            10-15               23.41 ± 1.64     23.34 ± 1.92 
            < 10                20.45 ± 1.4 

a The rate of photosynthesis was determined at 20°C, saturating CO₂ and 
600 µmol photons m⁻² sec⁻¹. 
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retention (Munro and Pelham, 1987) or vacuolar targeting 
signals, the proteins will be secreted along the default 
pathway (Dorel et al., 1989). Targeting to the vacuole is not 
well understood. Studies on the N-glycosylation of vacuolar 
proteins (Sonnewald et al., 1990; Voelker et al., 1989) 
demonstrated that the targeting signal must be part of the 
polypeptide chain of vacuolar proteins. In the case of the 
barley lectin, the C-terminal propeptide was shown to be 
necessary for proper targeting of this lectin to vacuoles of 
transgenic tobacco plants (Bednarek et al., 1990). To date 
there have been no reports in the literature of the successful 
targeting of chimeric proteins to the vacuole of higher 
plants. 

In order to obtain vacuolar targeting of yeast invertase, 
we have fused the first 123 N-terminal amino acids of the 
mature patatin protein to the suc2 coding region. Our 
results are the first example of the targeting of a non- 
vacuolar protein to vacuoles of transgenic tobacco. Further- 
more, the fusion protein provides enzymatic activity and 
can be used in physiological studies designed to investigate 
the role of sucrose in the different cellular compartments. 

Alteration in invertase activity as the leaf develops 

The constructs were all under the control of the 35S- 
CaMV promoter so that changes in their rate of accumula- 
tion are presumably due to the stability of the invertase 
proteins in the different compartments. Cell wall protein 
(von Schaewen et al., 1990) accumulated during leaf 
development and reached the highest levels of activity 
compared to the other invertase forms, although the amount 
of detectable RNA was lowest (data not shown). Cytosolic 
invertase activity was highest in young leaves and de- 
creased as the leaf matured. This might indicate that 
different and/or more proteases are expressed during leaf 
development. The activity of the vacuolar invertase was 
found to be rather low compared with the other forms, 
even though the amount of RNA was highest (data not 
shown), which might indicate a lower stability of the fusion 
protein in the vacuole. 



The amount of invertase with which the plant can cope 
depends on its subcellular location 

As expected, the plant is very sensitive to invertase in the 
cytosol, and less sensitive to invertase in the cell wall. The 
cytosolic invertase will directly interfere with sucrose 
synthesis, whereas cell wall invertase will only be effective 
when it is in the right place to interfere with the passage 
of sucrose to the phloem. It is surprising that tobacco is 
so sensitive to an invertase in the vacuole. This indicates 
that the flux of carbon between the cytosol and the vacuole 
can be quite high. Rapid turnover of the vacuolar pool of 



sucrose has been reported previously for pea and barley 
(Borland and Farrar, 1988; Farrar and Farrar, 1986). 

The time-course of the visual phenotype depends on the 
site at which invertase is expressed 

The plants with cytosolic invertase already showed their 
full phenotype in young leaves. This shows that sucrose 
synthesis is obligatory in young leaves, even if they do not 
export sucrose. The sucrose is presumably used as a 
temporary storage product and can be regulated far better 
than free hexoses. 

In plants with cell wall or vacuolar invertase the phenotype 
developed later, probably during the sink-source transition. 
This is easily explained in plants with cell wall invertase, 
because from the time of this transition onwards the cell 
wall invertase will gain ready access to sucrose moving 
through the apoplast to the phloem (von Schaewen et al., 
1990). Since, in the case of the cytosolic invertase trans- 
formant, the phenotype had already developed in young 
leaves (see above), we conclude that the phenotype is 
caused by a perturbation of the sucrose metabolism and 
accumulation in the leaf, rather than other developmental 
changes associated with the sink-source transition 
per se. 

The delayed development of the phenotype in plants 
with vacuolar invertase is unexpected, as the activity of 
invertase did not increase greatly during leaf expansion. 
One possible explanation would be that the permeability 
of the tonoplast for sucrose increases during the later 
stages of leaf development. This could represent a mechan- 
ism to allow leaves to stop storing sucrose in the vacuole 
(for osmotic or short-term storage purposes) and start 
exporting it. Another, although much more speculative, 
possibility is that sucrose has to pass through the vacuole 
in order to become exported. 

Changes of carbohydrate depend on the subcellular 
location of the invertase 

In all cases, invertase resulted in increased levels of free 
hexoses. For the cell wall invertase, glucose and fructose 
were present in approximately equal amounts (see also 
von Schaewen et al., 1990; Stitt et al., 1990), as expected 
from the hydrolysis of sucrose. The vacuolar transformant 
showed a 2-4-fold excess of glucose over fructose in 
these experiments, though this difference has not always 
been observed. One possible explanation is that the 
vacuolar transformant accumulated large amounts of 
starch; we have observed in other experiments that wild- 
type tobacco often contains an excess of glucose when 
the leaves have a high starch content (W.P. Quick, personal 
communication). However, more experiments are required 
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because several other explanations are also possible, 
including preferential transport of fructose out of the 
vacuole, or accumulation of fructans in the vacuolar 
transformants. 

In contrast, fructose accumulated to a 5-10-fold excess 
over glucose in cytosolic invertase plants. This has been 
observed consistently in other experiments. The accumu- 
lation of fructose could be explained either by the absence 
of sufficient fructokinase activity, or by inhibition of fructo- 
kinase in these plants. Preliminary experiments (M. Brauer, 
unpublished data) indicate that these plants contain fructo- 
kinase but that the enzyme is very susceptible to product 
inhibition by fructose, as has also been reported for fructo- 
kinase from soybean nodules (Copeland and Morell, 1985) 
and maize kernels (Doehlert, 1989). Tobacco leaves also 
contain a broad-specificity hexokinase, but this enzyme 
has a fivefold lower affinity for fructose than for glucose 
(M. Brauer, unpublished data). These results could explain 
the observed imbalance between glucose and fructose in 
the cytosolic invertase transformant. 

This explanation, if correct, implies that most of the 
hexoses in the cell wall or vacuolar transformant are 
located in the apoplast or vacuole, i.e. after increasing 
the activity of invertase dramatically in these compartments, 
the transport of hexoses back into the cytosol may have 
become limiting. Accumulation of hexoses outside the 
cytosol could also result in osmotic problems. In this 
context, it is interesting that vacuolar and cell wall trans- 
formants share the phenotype feature that the leaves 
remain green and do not accumulate sugars in the vicinity 
of the major vascular bundles. The supply of water in and 
around the vascular bundles will be large and water flow 
in the transpiration stream would also (in the case of the cell 
wall invertase) tend to move sugars in the apoplast away 
from the vascular bundle and concentrate them in the 
intervening lamellar sectors (von Schaewen et al., 1990; 
Stitt et al., 1990). This provides an explanation for the 
mosaic phenotype observed in these plants. 

One rather surprising feature of our results is that the 
sucrose concentration in the leaves of the invertase trans- 
formants often increases. This has been observed con- 
sistently for the cell wall transformants, and occasionally 
for the cytosolic and vacuolar transformants. The variation 
in the case of the latter two transformants may depend on 
the growth conditions, with sucrose accumulating under 
high light conditions. More studies are required, but our 
results indicate that disruption of sucrose metabolism by 
invertase may sometimes lead to direct inhibition of tran- 
sport, in addition to effects caused by hydrolysis of sucrose 
en route to the phloem. We have already shown that the 
inhibition of photosynthesis is caused by a general reduction 
of photosynthetic enzymes (Stitt et al., 1990) and it might 
be speculated that the expression of proteins needed for 
sucrose transport is decreased in these transformants. 



Sink development is affected by invertase 



In addition to the inhibition of photosynthesis, all of the trans- 
formants showed inhibited leaf and root development. 
This can be explained because sucrose export from the 
source leaves will be decreased in all these transformants. 
However, it is striking that the largest inhibition of leaf 
expansion rate was observed in the cytosolic transformant, 
and that these leaves were also visually curled and thick. It 
might be speculated that sink leaves on the cytosolic 
transformant cannot store sucrose properly, and that 
normal growth is disturbed because of osmotic effects 
exerted by the high concentrations of fructose in the 
young leaves. 

The curling of the leaves might be explained by a higher 
hexose content on the upper side of the leaf (compared to 
the lower side), due to higher photosynthetic activity of 
these cells, which would lead to increased water influx, 
resulting in a more rapid cell expansion or even cell division. 
This shows that disruption of sink-leaf sucrose metabolism 
affects leaf expansion and development. 

In conclusion, we have transformed tobacco plants with 
yeast invertase and targeted the protein to the cell wall, 
the cytosol, or the vacuole. The transformants show a 
clear development-specific phenotype which varies 
depending on the compartment in which the invertase is 
expressed. There are similarities, but also striking differ- 
ences, in the biochemical phenotype in the three trans- 
formants. These experiments therefore directly demonstrate 
the central importance of compartmentation and protein 
targeting in plant development and metabolism. Further- 
more, these plants provide a novel and powerful tool to 
investigate a variety of physiological problems, including 
the 'sink' regulation of photosynthesis by accumulating 
carbohydrate (von Schaewen et al., 1990; Stitt et al., 
1990), the co-ordination of the metabolism and transport 
of sucrose, osmoregulation, and the regulation of leaf 
development. 



Experimental procedures 

Plants, bacterial strains and media 

Nicotiana tabacum L. 'Samsun NN' was obtained through 
'Vereinigte Saatzuchten' (Ebstorf, Germany). Arabidopsis thaliana 
L. 'Columbia C24' was kindly provided by J. P. Hernalsteens 
(Vrije Universiteit, Brussels, Belgium). Plants in tissue culture were 
grown under a 16 h light/8 h dark regime on Murashige and Skoog 
medium (Murashige and Skoog, 1962) containing 2% sucrose 
(2MS). Plants used for biochemical analysis were grown in the 
greenhouse as described in von Schaewen et al. (1990). Escherichia 
coli strain DH5α (Bethesda Research Laboratories, Gaithersburg, 
USA) was cultivated using standard techniques (Maniatis et al., 
1982). Agrobacterium tumefaciens strain C58C1 containing 
pGV2260 (Deblaere et al., 1985) was cultivated in YEB medium 
(Vervliet et al., 1975). 
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Reagents 

DNA restriction and modification enzymes were obtained from 
Boehringer Mannheim (Ingelheim, Germany) and New England 
Biolabs (Danvers, USA). Synthetic oligonucleotides were synthe- 
sized on an Applied Biosystems (Foster City, USA) DNA Synthesizer 
(380A). Reagents for SDS-PAGE were purchased from BioRad 
(St Louis, USA). Chemicals were obtained through Sigma Chemical 
Co. (St Louis, USA) or Merck (Darmstadt, Germany). 

Plasmid construction 

In order to obtain cytosolic invertase activity, the 400 bp Asp718/ 
HindIII fragment of the plasmid PI-58-INV (von Schaewen et al., 
1990) was exchanged with a synthetic oligonucleotide (Figure 1). 
This exchange resulted in the removal of the PI-II signal sequence 
(Keil et al., 1986), while the untranslated leader was unchanged. 
Vacuolar invertase was constructed using a DNA fragment 
(nucleotides 707-1895) from the patatin genomic clone pgT5 
(Rosahl et al., 1986) containing parts of the untranslated leader, 
the signal sequence, and also 597 nucleotides coding for the 
mature protein which allows vacuolar targeting of fusion proteins. 
This DNA fragment was fused via a HindIII linker to the suc2 gene 
containing nucleotides +64 to +1765 (Taussig and Carlson, 
1983). The chimeric gene obtained was introduced as an EcoRV/ 
SphI fragment into a plant expression cassette containing the 
35S promoter of cauliflower mosaic virus (CaMV) (nucleotides 
6909-7437; Franck et al., 1980) and the polyadenylation signal of 
the octopine synthase gene (OCS). The constructs were cloned 
into the binary vector pBin19 (Bevan, 1984) as EcoRI/HindIII 
fragments and directly transformed into Agrobacterium strain 
C58C1:pGV2260. 

Transient expression in Arabidopsis protoplasts 

Isolation, transformation and cultivation of Arabidopsis protoplasts 
were essentially as described by Damm et al. (1989). The processing 
of the protoplasts and the culture medium for further experiments 
followed the protocol of von Schaewen et al. (1990). 

Cell fractionation of Arabidopsis protoplasts after 
transient expression 

Transient expression experiments were carried out as described 
previously (von Schaewen et al., 1990). After a 2-3 day cultivation 
period, 4 × 10⁵ cells were pelleted for 5 min at 80 g in a microfuge 
tube. The supernatant was removed with a drawn-out Pasteur 
pipet. The cells were resuspended in 100 µl of cold sucrose buffer, 
which is isotonic for intracellular organelles (12% sucrose [w/w], 
1 mM DTT (DL-dithiothreitol), 1 mM EDTA, 50 mM Tris-HCl 
pH 7.8, 1 µM f.c. protease inhibitor cocktail, i.e. Leupeptin, Antipain 
and Pepstatin), and lysed by vortexing in the presence of glass 
beads (five times for 30 sec at room temperature, beads to 
volume ratio 1:4). The extraction mixture was then subjected to a 
series of centrifugation steps to remove cellular debris (700 g for 
5 min, yielding P1) and large organelles, e.g. chloroplasts and 
mitochondria (10 000 g for 10 min, yielding P2). The supernatant 
of P2 was centrifuged (Beckman 50Ti, Beckman Instruments, 
Palo Alto, USA) for 2 h at 80 000 g at 4°C to pellet membranes 
and vesicles (P3). The membrane-free supernatant of P3 was 
referred to as the soluble mixture of both the cytoplasmic and the 
vacuolar cell fractions (Sol). Fractions P2, P3 and Sol were brought 
to the same volume (100 µl) with sucrose buffer and 50 µl aliquots 
were subjected to SDS-PAGE. 
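The centrifugation steps above are given as relative centrifugal force (g). To reproduce them on a given rotor, the force has to be converted to a rotor speed using the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in cm. The sketch below is illustrative only: the function names are hypothetical and the rotor radius of 8.0 cm is a placeholder assumption, not a specification of the rotor used in this work.

```python
import math

# Standard conversion constant for RCF = K * radius_cm * rpm**2.
K = 1.118e-5


def rcf_for_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return K * radius_cm * rpm ** 2


def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (K * radius_cm))


# Fractionation steps as described in the text: (label, RCF in g, duration).
steps = [
    ("pellet cells", 80, "5 min"),
    ("P1, cell debris", 700, "5 min"),
    ("P2, large organelles", 10_000, "10 min"),
    ("P3, membranes and vesicles", 80_000, "2 h"),
]

# Placeholder radius (assumption); use the value from the rotor manual.
ASSUMED_RADIUS_CM = 8.0

for label, rcf, duration in steps:
    rpm = rpm_for_rcf(rcf, ASSUMED_RADIUS_CM)
    print(f"{label:28s} {rcf:>6d} g for {duration:6s} -> {rpm:8.0f} rpm")
```

Since the required speed scales with the square root of the force, the 80 000 g step needs only about 2.8 times the speed of the 10 000 g step on the same rotor.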



Invertase activity 

Detection of yeast invertase after SDS-PAGE and measurement 
of alkaline invertase activity were as described by von Schaewen 
et al. (1990). Proteins were applied to the gel without prior boiling 
in the presence of SDS thus leaving protein aggregates intact and 
active. 



Tobacco transformation 

Transformation of tobacco plants was carried out using the 
Agrobacterium tumefaciens leaf disc technique as described by 
Rosahl et al. (1987). 



Isolation of plant vacuoles from protoplasts 

Vacuoles were released from leaf mesophyll protoplasts by a 
combination of osmotic and thermal shock, using a modification 
of the protocol described by Boller and Kende (1979). Pelleted 
protoplasts (3-5 × 10⁶) were chilled on ice shortly before resuspen- 
sion in 6 ml of prewarmed (40°C) lysis medium (0.2 M mannitol, 
10% Ficoll Type 400, 20 mM EDTA, 2 mM DTT, 5 mM Hepes, 
pH 8.0) and transferred into a 15 ml Corex tube at room temperature. 
Within 5-10 min, release of vacuoles from the lysed protoplasts 
(followed under the microscope) was complete. The mixture was 
briefly chilled on ice and overlayered with 3 ml of cold 4% Ficoll 
solution (1 volume lysis medium + 1.5 volumes vacuole buffer) 
followed by 1 ml of cold vacuole buffer (0.45 M mannitol, 10 mM 
Hepes pH 7.5, 1 mM L-cysteine, 1 µg ml⁻¹ each of Leupeptin, 
Antipain and Pepstatin). The gradient was centrifuged for 30 min 
at 5000 g in a 'swing-out' rotor at 10°C. Under these conditions 
cellular debris pellets, unlysed protoplasts collect at the 4% Ficoll 
interphase and vacuoles float to the top of the gradient. The 
vacuoles were collected with a Pasteur pipet from the meniscus 
of the 0% Ficoll step and then frozen at -70°C. The yield of 
vacuoles was usually 20% of the initially lysed protoplasts, as 
monitored microscopically by counting and enzymatically by 
α-mannosidase assays (Van der Wilden et al., 1980). To compare 
invertase activity in protoplasts and the vacuolar fraction, equal 
α-mannosidase activities (equivalent to 250 ng α-mannosidase) 
were subjected to SDS-PAGE and stained for invertase activity 
as described previously (von Schaewen et al., 1990). As a standard, 
α-mannosidase from 'jack bean' (Sigma) was used. 
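The composition of the intermediate gradient step follows from the stated mixing ratio of 1 volume lysis medium (10% Ficoll, 0.2 M mannitol) to 1.5 volumes vacuole buffer (no Ficoll, 0.45 M mannitol). The short sketch below works through that arithmetic; the helper name `mix` is hypothetical, and the calculation assumes that volumes are additive on mixing.

```python
def mix(components):
    """Concentration of a mixture of (volume, concentration) pairs.

    Assumes additive volumes; concentration units are preserved.
    """
    total_volume = sum(volume for volume, _ in components)
    total_amount = sum(volume * conc for volume, conc in components)
    return total_amount / total_volume


# 1 volume lysis medium + 1.5 volumes vacuole buffer, as given in the text.
ficoll_percent = mix([(1.0, 10.0), (1.5, 0.0)])   # % Ficoll
mannitol_molar = mix([(1.0, 0.2), (1.5, 0.45)])   # M mannitol

print(f"Ficoll:   {ficoll_percent:.1f} %")
print(f"Mannitol: {mannitol_molar:.2f} M")
```

The Ficoll concentration comes out at the 4% stated for the overlay, and the mannitol concentration lies between those of the lysis medium and the vacuole buffer, so the three layers form a step gradient in Ficoll.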



Measurement of photosynthesis 

Eight leaf discs (each 0.5 cm²) were removed from tobacco plants 
and placed in a leaf disc O₂ electrode (Hansatech, Kings Lynn, 
UK). Illumination was 600 µmol photons m⁻² sec⁻¹ and saturating 
CO₂ was supplied from 400 µl 2 M KHCO₃/K₂CO₃ buffer pH 9.3 
(Quick et al., 1989) at 20°C. After 10 min illumination the samples 
were frozen in liquid N₂ under continuing illumination. 

Determination of soluble sugars and starch 

Leaf discs were taken at the indicated times and extracted with 
80% ethanol (10 mM Hepes-KOH pH 7.4) at 80°C for 15 min. The 
supernatant was used for the determination of glucose, fructose 
and sucrose (Stitt et al., 1989). The remaining leaf disc material 
was extracted a second time, washed in water and dried. Deter- 
mination of starch was carried out as described by Stitt et al. 
(1978). 

Expression of Rice Lectin Is Governed by Two
Temporally and Spatially Regulated mRNAs in
Developing Embryos

Thea A. Wilkins and Natasha V. Raikhel¹

Department of Energy Plant Research Laboratory, Michigan State University, East Lansing, Michigan 48824-1312

Two cDNA clones encoding rice lectin have been isolated and characterized to investigate the expression of rice
lectin at the molecular and cellular levels. The two cDNA clones code for an identical 23-kilodalton protein which is
processed to the mature polypeptide of 18 kilodaltons by co-translational cleavage of a 2.6-kilodalton signal
sequence and selective removal of a 2.7-kilodalton COOH-terminal peptide which contains a potential N-linked
glycosylation site. In addition, the mature 18-kilodalton lectin is post-translationally cleaved between residues 94
and 95 to yield polypeptides of 10 kilodaltons and 8 kilodaltons, corresponding to the NH2- and COOH-terminal
portions of the mature subunit, respectively. RNA gel blot analysis established that rice lectin is encoded by two
mRNA transcripts (0.9 kilobase and 1.1 kilobase). On DNA gel blots, the rice lectin cDNAs hybridize specifically to
a single restriction fragment. In situ hybridization showed localization of the 1.1-kilobase rice lectin mRNA in root
caps and specific cell layers of the radicle, coleorhiza, scutellum, and coleoptile. RNA gel blot analysis demonstrated
that both the 0.9-kilobase and 1.1-kilobase mRNAs are present in developing rice embryos. The two lectin mRNAs
are differentially expressed temporally such that the 1.1-kilobase lectin mRNA accumulates to levels twofold higher
than the 0.9-kilobase mRNA.



INTRODUCTION 



Plant lectins are a class of proteins that bind and cross-link 
specific carbohydrates. Because of their unique carbohy- 
drate-binding properties, lectins are widely used as tools 
in medical cell biology (Lis and Sharon, 1986). Historically, 
plant lectin research has focused on the isolation and 
characterization of new lectin species to broaden the spec- 
trum of specific carbohydrate-binding moieties. Although 
the function of lectins in plants remains obscure, dissecting 
the regulation of expression of lectin genes at the molecular
level should facilitate elucidation of the protein function
in vivo.

Many of the Gramineae synthesize N-acetylglucosamine
(GlcNAc)-binding lectins with similar immunological properties
(Peumans and Stinissen, 1983). These lectins accumulate
in a cell-type specific manner in various organs
of developing embryos and young seedlings. Rice lectin,
initially purified and characterized by Tsuda (1979) from
rice bran, is a dimeric protein composed of two glycine-
and cysteine-rich 18-kD subunits that lack covalently
bound sugar residues. In the cultivated rice species Oryza
sativa, the majority of the 18-kD subunits undergo a
proteolytic cleavage event which yields two subunits of 8 kD
and 10 kD (Stinissen, Peumans, and Chrispeels, 1984).
This lectin is synthesized as a 23-kD monomeric precursor
on the rough endoplasmic reticulum (RER) and is subsequently
assembled into dimers within the lumen of the RER
(Stinissen, Peumans, and Chrispeels, 1984). Assembled
dimers are only transiently associated with the RER before
being transported to and deposited in vacuoles/protein
bodies (Stinissen, Peumans, and Chrispeels, 1984). Rice
lectin accumulates in specific cell layers of the scutellum,
coleorhiza, radicle, root cap, and throughout cell layers of
the coleoptile of embryos (Mishkind, Palevitz, and Raikhel,
1983).

1 To whom correspondence should be addressed.

We are interested in the molecular mechanisms regulat- 
ing cell-specific expression of the Gramineae lectins. Two 
cDNA clones encoding rice lectin have been isolated and 
used to examine the expression of rice lectin in developing 
embryos. In this paper, we present evidence that both 
cDNA clones represent two mRNA transcripts. Each lectin 
mRNA transcript exhibits a distinct pattern of temporal 
expression in developing embryos. Moreover, the cell-type 
specific expression of rice lectin mRNAs is developmentally
and spatially regulated.

lectin is a glutamine, which is presumably modified by
cyclization to pyrrolidone carboxylic acid (Chapot et al.,
1986), a residue resistant to Edman degradation. Initiating
with glutamine Q + 1, the cDNAs encode a protein comprising
199 amino acids with calculated Mr 20,172. However,
amino acid sequence analysis of the COOH-terminal
amino acids indicates that mature rice lectin terminates at
the glycine residue G173 (arrow in Figures 1A and 1B).
Thus, determination of terminal amino acid residues revealed
that the mature polypeptide of rice lectin comprises
173 amino acids with Mr 17,512 (arrows in Figures 1A and
1B and a solid box in Figure 1B). The amino acid composition
indicates that the mature subunit of rice lectin is a
glycine- and cysteine-rich polypeptide. Cysteine (23%) and
glycine (19.7%) together account for almost 43% of the
mature polypeptide amino acid composition.
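
The composition figures quoted above are simple residue frequencies over the mature polypeptide. A minimal sketch of such a calculation is given here; the function name and the toy sequence are hypothetical and are not the rice lectin sequence, which is not reproduced in this section.

```python
from collections import Counter


def composition_percent(sequence):
    """Return {residue: percent of all residues} for a one-letter sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in counts.items()}


# Hypothetical toy peptide, used only to illustrate the arithmetic.
toy_peptide = "GCGCA"
comp = composition_percent(toy_peptide)
print({aa: round(pct, 1) for aa, pct in sorted(comp.items())})
```

Applied to a full-length mature subunit, the same function would give the cysteine and glycine percentages directly.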

In addition to the signal sequence and the mature rice
lectin subunit, the cDNAs encode proteins with an additional
26 amino acids (Mr 2678) extending beyond the
COOH terminus of mature rice lectin (boxed residues in
Figure 1A, stippled box in Figure 1B). This COOH-terminal
extension is a relatively hydrophobic domain and contains
a potential N-linked glycosylation site at asparagine residue
N179 (asterisk in Figures 1A and 1B). Rice lectin is therefore
synthesized as a preproprotein that requires the proteolytic
removal of the signal sequence and post-translational
processing of a COOH-terminal domain to yield the
mature polypeptide. In vacuoles, the mature 18-kD subunit
polypeptide undergoes additional post-translational processing
to yield two smaller polypeptides of approximately
10 kD and 8 kD (Stinissen, Peumans, and Chrispeels,
1984). To resolve the relationship between these polypeptides
and the protein encoded by the cDNAs, both polypeptides
were purified and subjected to NH2-terminal and
COOH-terminal amino acid sequence analyses. Results
from these analyses indicate that the mature subunit of
rice lectin is proteolytically cleaved between amino acid
residues N94 and G95 as deduced from the cDNA clones
(open arrowhead, Figure 1A). The resultant 10-kD and 8-kD
polypeptides correspond to the NH2- and COOH-terminal
portions of the mature 18-kD protein, respectively.

A comparison of amino acids from rice lectin and isolectin
B of wheat germ agglutinin (WGA-B) is presented in
Figure 2. Rice lectin exhibits 73% identity with WGA-B
(boxed amino acids in Figure 2) within the coding region
of the mature subunits spanning from glutamine Q + 1 to
glycine G171 in WGA-B or glycine G173 in rice lectin. The
overall homology between the two lectins increases to
79.5% when conserved amino acid changes (asterisks in
Figure 2) are included in the comparison. Both rice lectin
and WGA-B require the post-translational processing of
COOH-terminal domains to produce the mature 18-kD
subunit. Alignment of the 26-amino acid COOH-terminal
domain from the proprotein of rice lectin and the 15-amino
acid COOH-terminal domain from pro-WGA-B for maximal
homology shows a 46.7% overall amino acid conservation,
indicating that this region is less conserved than the coding
region of the mature protein.

Figure 2. Comparison of Amino Acid Sequences between Rice
Lectin and Isolectin B of Wheat Germ Agglutinin (WGA-B).

The complete deduced amino acid sequence of rice lectin was
aligned for maximal homology to the available amino acid sequence
of WGA-B (Raikhel and Wilkins, 1987). Identical amino
acids are depicted by boxed residues, whereas conserved amino
acid changes between the two lectins are denoted by asterisks.
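
The identity and homology percentages quoted for the rice lectin and WGA-B comparison can be obtained by counting matching columns in a pairwise alignment. The sketch below illustrates one way to do this. The conservative-substitution groups, the treatment of gap columns, and the toy sequences are all assumptions made for illustration; the source does not state how conserved changes or gaps were scored.

```python
# Assumed grouping of chemically similar residues (illustrative only).
CONSERVED_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ")


def is_conserved(a, b):
    """True if two different residues fall in the same assumed group."""
    return any(a in group and b in group for group in CONSERVED_GROUPS)


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent identity + conserved changes).

    Both inputs are pre-aligned strings of equal length using '-' for gaps.
    Columns containing a gap are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = conserved = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
        elif is_conserved(a, b):
            conserved += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return (100.0 * identical / compared,
            100.0 * (identical + conserved) / compared)


# Hypothetical toy alignment, not the lectin sequences.
print(identity_and_similarity("ACDEFGK-L", "ACDDFGRWL"))
```

Under this counting scheme, 7 matching positions out of 15 compared would give 46.7%, although the source does not report the underlying counts.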



Rice Lectin Is Encoded by Two Different mRNAs 

To explore the relationship between the two cDNA clones
encoding rice lectin, an RNA gel blot containing total RNA
from developing rice embryos [10 days to 20 days post-anthesis
(DPA)] was probed with labeled insert from
cRL852 or cRL1035. Two mRNA species of approximately
1.1 kb and 0.9 kb were identified (Figure 3A). Therefore,
the two cDNA clones correspond to the two mRNAs and
did not arise from cloning artifacts. To discriminate expression
due solely to the 1.1-kb mRNA species, a clone-specific
probe (cRL165) encompassing 165 bp of the 3'-untranslated
region unique to cRL1035 was constructed
(see Figure 1B). The specificity of cRL165 as a clone-specific
probe for the cDNA cRL1035 and the 1.1-kb lectin
mRNA transcript was confirmed by DNA gel and RNA gel
blot analyses, respectively (data not shown).

To determine the number of rice lectin genes responsible
for the two mRNAs, DNA gel blots containing restricted
genomic DNA were hybridized with 32P-labeled inserts from
cDNA clones cRL852, cRL1035, or cRL165. Figure 3B
shows a representative DNA gel blot depicting single restriction
fragments of 11 kb and 15 kb detected in EcoRI-
and HindIII-digested genomic DNA (cv. IR36), respectively.

METHODS



Plant Material 

Developing rice (Oryza sativa cv. Lemont) embryos were collected
from spikes harvested at 5 DPA, 10 DPA, 20 DPA, 30 DPA, and
40 DPA from plants maintained under greenhouse conditions.
Embryos used for in situ hybridization experiments were processed
immediately, while the bulk of collected embryos (10, 30,
40 DPA) were quick frozen in liquid nitrogen and stored at -80°C
for RNA isolation.

Young seedlings of the rice cultivars Nato or IR36 were germinated
and grown in Baccto professional potting mix in a growth
chamber with a 12-hr light period at 27°C and a 12-hr dark period
at 21°C, with 70% humidity. Shoots of 10-day-old seedlings were
collected and frozen in liquid nitrogen for isolation of total DNA.



Screening of a λgt10 cDNA Library for Rice Lectin

A λgt10 cDNA library constructed from poly(A)+ RNA isolated
from spikes of rice O. sativa cv. Nato was provided by Susan
Wessler and Ron Okagaki (University of Georgia, Athens, GA).
Approximately 160,000 recombinant phage were grown on Escherichia
coli C600hfl at a density of 40,000 per 150-mm Petri plate
and replicated onto nitrocellulose filters as described in Maniatis,
Fritsch, and Sambrook (1982). The nitrocellulose filters were
hybridized with a 32P-random primer-labeled cDNA insert (Feinberg
and Vogelstein, 1983) from clone WGA-B (clone pNVR1
described in Raikhel and Wilkins, 1987) for 18 hr in 6 x SSC, 5 x
Denhardt's solution, 0.2% SDS, and sonicated salmon sperm
DNA at 5 ng/ml. Post-hybridization washes included three 15-min
washes at room temperature and two 15-min washes at 60°C in
3 x SSC, 0.1% SDS. Positive phage were plaque-purified to
homogeneity (Maniatis, Fritsch, and Sambrook, 1982) under high
stringency screening conditions using a 32P-labeled insert from
WGA-B (Raikhel and Wilkins, 1987).



DNA Nucleotide Sequence Analysis 

Inserts, designated cRL852 and cRL1035, were purified from
selected phage by electrophoresis in low-melting-point agarose
(Struhl, 1985) and cloned into pUC119 (Vieira and Messing, 1987)
in both orientations for subsequent DNA sequence determination.
A sequential series of overlapping deletions from both strands of
the cDNA was generated by T4 DNA polymerase (Dale and Arrow,
1987) from full-length, single-stranded DNA templates (Vieira and
Messing, 1987). Single-stranded deletion templates were sequenced
by the dideoxynucleotide chain termination method (Sanger,
Nicklen, and Coulson, 1977) using 35S-dATP and 7-deaza-dGTP
instead of dGTP (Mizusawa, Nishimura, and Seela, 1986).
Computer alignment of overlapping deletions, and amino acid and
sequence analysis were performed using Microgenie software
(Beckman).

A fortuitous deletion encompassing the terminal 165 bp of
3'-untranslated region unique to cRL1035 was retrieved for use as
a clone-specific probe. This partial cDNA clone was maintained in
pUC119 and given the designation cRL165.



Amino Acid Sequence Determinations of NH2- and COOH-Terminal
Amino Acid Residues of Rice Lectin

Rice lectin was purified from 10 g of mature rice embryos (cv.
IR36) via affinity chromatography on immobilized N-acetylglucosamine
(Selectin 1, Pierce) according to the procedure detailed in
Mansfield, Peumans, and Raikhel (1988). To enhance resolution
of the rice lectin during SDS-PAGE on a 15% polyacrylamide gel
(Laemmli, 1970), the purified protein was S-carboxyamidated at
37°C for 30 min in the presence of 240 mM iodoacetamide
(Raikhel, Mishkind, and Palevitz, 1984) prior to electrophoresis.
Individual subunits (8 kD and 10 kD) of rice lectin were visualized
by staining the gel briefly (10 min) in Coomassie blue, followed by
destaining in 30% methanol, 7.5% acetic acid. The 8-kD and 10-kD
polypeptides were excised from the gel, electroeluted in Laemmli
(1970) buffer, and lyophilized. SDS was removed from protein
by the organic extraction method of Konigsberg and Henderson
(1983). Removal of salts from the protein was accomplished by
dialysis against 15% acetic acid at 4°C in the dark for 2 days
prior to lyophilization.

Approximately 200 pmol of gel-purified rice lectin was applied
to a Model 477 Sequenator equipped with a 120 on-line PTH-amino
acid analyzer (Applied Biosystems, Inc.) for determination
of NH2-terminal amino acid residues. The terminal amino acids of
the COOH terminus were determined by carboxypeptidase Y
digestion of 500 pmol of rice lectin via the procedure of Hayashi
(1977). The identification and quantitation of free amino acids in
digestion mixtures were accomplished by HPLC analysis using
precolumn derivatization with o-phthaldialdehyde. Amino acid sequence
determinations were performed at the Protein Chemistry
Facility, University of California, Irvine.

RNA Gel Blot Analysis 

Total RNA was isolated from 50 mg to 150 mg of developing rice
embryos via the hot phenol method of Finkelstein and Crouch
(1986) with the addition of 1% 2-mercaptoethanol to the homogenization
buffer. RNA gel blots were prepared from 25 µg of RNA
for each developmental stage of embryos and hybridized with
random-primer-labeled cRL852 insert under stringent conditions
(Raikhel, Bednarek, and Wilkins, 1988). Blots were exposed to
Kodak XAR-5 film with intensifying screens at -80°C for 10 hr to
15 hr. Autoradiograms were scanned with a Gilford densitometer.

Gene Reconstruction Analysis 

Total DNA was isolated from 10-day-old rice (cv. IR36 or Nato)
seedlings according to Shure, Wessler, and Fedoroff (1983) and
restricted to completion with EcoRI, HindIII, KpnI, SmaI, or XbaI.
Two micrograms of digested DNA (3.3 x 10^6 genome equivalents)
and 0.5-copy, 1.0-copy, and 3.0-copy equivalents of the cRL1035
cDNA clone were fractionated by agarose gel electrophoresis and
transferred to nitrocellulose (Maniatis, Fritsch, and Sambrook,
1982). Gene copy reconstructions were based upon a rice genome
size of 5.47 x 10^5 kb per haploid genome (Francis, Kidd, and
Bennett, 1985). Hybridization and post-hybridization washes of
the reconstruction blot were performed as described for RNA gel
blots with the exception that random-primer radiolabeled insert
from cRL1035 was used as a probe.
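
The arithmetic behind a gene copy reconstruction can be sketched as follows. The genome size and the 2 µg DNA load are taken from the text. The average mass of 660 g/mol per base pair and the 1035-bp fragment length are assumptions introduced for illustration, since the source does not state the insert length used.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MOLAR_MASS = 660.0        # g/mol per base pair, a common approximation (assumption)

GENOME_SIZE_BP = 5.47e8      # 5.47 x 10^5 kb per haploid genome (from the text)
DNA_LOADED_G = 2e-6          # 2 micrograms of digested genomic DNA (from the text)


def genome_equivalents(dna_mass_g, genome_size_bp):
    """Number of haploid genomes contained in a given mass of DNA."""
    mass_per_genome_g = genome_size_bp * BP_MOLAR_MASS / AVOGADRO
    return dna_mass_g / mass_per_genome_g


def reconstruction_mass_g(copies, dna_mass_g, genome_size_bp, fragment_bp):
    """Mass of cloned fragment equal to `copies` per haploid genome."""
    return copies * dna_mass_g * fragment_bp / genome_size_bp


print(f"{genome_equivalents(DNA_LOADED_G, GENOME_SIZE_BP):.2e} genome equivalents")

HYPOTHETICAL_FRAGMENT_BP = 1035  # assumed length, for illustration only
for copies in (0.5, 1.0, 3.0):
    mass_pg = 1e12 * reconstruction_mass_g(
        copies, DNA_LOADED_G, GENOME_SIZE_BP, HYPOTHETICAL_FRAGMENT_BP)
    print(f"{copies}-copy equivalent: {mass_pg:.2f} pg")
```

With these assumptions, the first calculation agrees with the 3.3 x 10^6 genome equivalents stated for 2 µg of DNA.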

Abstract

Two strains of Streptomyces venezuelae were found to produce high-affinity, biotin-binding proteins, termed streptavidin v1 and v2,
respectively. Both proteins were isolated to purity, and their corresponding genes were cloned and sequenced. Compared to streptavidin
from S. avidinii, streptavidin v1 had only a single amino acid substitution and streptavidin v2 showed 9 such differences. The
substitutions were remarkably conservative, none of which affected the amino acid residues known to be important to the biotin-binding
properties or to the structure of the tetrameric protein. The results also indicate that the biosynthesis of such biotin-binding proteins is not
simply a curious anomaly in a single species of Streptomyces. It is suggested that the classification of S. avidinii as a unique species
should be reconsidered. The occurrence of these proteins appears to be linked to the production of an unusual synergistic antibiotic
complex.

Keywords: Streptavidin; Sequence comparison; Biotin-binding protein; (Streptomyces) 



1. Introduction 

Egg-white avidin and bacterial streptavidin are two
biotin-binding proteins with very similar properties.
In our early studies on the structure-function relationship
of these two proteins, we performed chemical modification
experiments [1-3] which were later supplemented by X-ray
analysis [4]. Such studies indicated that there may be little
room for drastic changes in their structure, since chemical
modifications were usually associated with a loss of bind- 
ing. This notion became even clearer when we sequenced 
an antibiotin antibody and found combining-site motifs 
similar to those of avidin and streptavidin [5]. 

However, chemical modification usually introduces
bulky moieties into a protein, and the final conclusions
emanating from such studies are often obscured. In order
to overcome this problem, modifications that do not
increase the size of a target amino acid are
required. This can be accomplished by two different approaches:
(i) by substituting specified amino acids by
site-directed mutagenesis, and (ii) by comparing the structures
of other biotin-binding proteins in nature.

We have decided to apply both approaches in our future 
studies. We begin with the second approach, under the 
premise that if significant changes in the binding site are 
detected, this would suggest which amino acids are suit- 
able candidates for site-directed mutagenesis. If we find to 
the contrary, this would suggest that nature is indeed very 
conservative and any change in the binding site would 
probably lead to a reduction in the affinity for biotin. 

The original discovery of streptavidin indicated that this
protein forms part of a peculiar synergistic antibiotic complex
in a novel species of Streptomyces, named S. avidinii
[6,7]. More recently [8], a similar type of antibiotic activity
was described in two strains of a different species (S.
venezuelae strains Tü 2460 and Tü 2605). We therefore
examined a variety of Streptomyces strains and found that
only the two S. venezuelae strains which produce the
antibiotic also produce biotin-binding proteins which are
similar to streptavidin from S. avidinii. The respective
proteins were termed streptavidin v1 and v2. Determination
of their sequences revealed a very close correlation to
the original streptavidin, indicating a strict conservation of
the residues for biotin binding and protein structure.

2. Materials and methods

2.1. Bacteria

Streptomyces venezuelae strains Tü 2460 and Tü 2605
were generous gifts of Professors H. Zähner and G. Jung
of Tübingen University, Germany. Type strains of S.
avidinii ATCC 27419, S. cattleya NRRL 8057, S.
clavuligerus ATCC 27064 (DSM 738), S. griseus ATCC
10137, S. jumonjinensis NRRL 5641, S. lavendulae ATCC
8664 and S. venezuelae ATCC 10712 were obtained from
the respective culture collections.

S. lividans 66, S. lactamdurans MA2908, S. coelicolor
A3 17, and Streptomyces strain aleph 50 were supplied by
Y. Aharonowitz of Tel Aviv University, Israel.

2.2. Materials 

Restriction and modifying enzymes were purchased
from Boehringer-Mannheim Biochemica (Mannheim, Germany)
or from Promega (Madison, WI, USA). The DNA
markers comprised the 1 kb ladder from Life Technologies
(Gaithersburg, MD, USA). Sequenase V2.0 was a product
of United States Biochemical (Cleveland, OH, USA). Escherichia
coli XL1-Blue was from Stratagene Cloning
Systems (La Jolla, CA, USA). Radiochemicals and nylon
filters (Hybond-N, 0.45 µm) were obtained from Amersham
International (Buckinghamshire, UK). Common laboratory
reagents, biochemicals and chemicals were obtained
from Sigma (St. Louis, MO, USA) or from E.
Merck (Darmstadt, Germany).

2.3. Hybridization 

For genomic blotting, chromosomal DNA from different
strains of Streptomyces was prepared as described
previously [9]. The samples were digested for 4 h using the
desired enzyme (e.g., BamHI, PstI, SmaI and KpnI).

The DNA digests were loaded on a 1% agarose gel
equilibrated with TAE buffer (40 mM Tris-acetate and 1
mM EDTA, pH 8.0). Following electrophoresis (16 h, 7
mA), the DNA in the gels was denatured by 0.5 M NaOH
in 1.5 M NaCl, neutralized using 1 M Tris (pH 7.4) in 1.5
M NaCl, and transferred to nylon filters [10]. The samples
were crosslinked to the membrane by ultraviolet irradiation
at 254 nm using a Stratagene crosslinker at an intensity of
120 mJ cm⁻².

Two oligonucleotide probes were designed. Probe 2492
(5'-ATG CAT ATG CGC AAG ATC GTC GTT GCA
GC-3') corresponded to positions 50-72 of the streptavidin
gene [11] with an NdeI restriction site attached at the
N-terminus, and probe 2493 (5'-CTT AAG CTT CTA
CTG AAC GGC GTC G-3') corresponded to positions
583-601 with a HindIII site at the C-terminus. The oligonucleotide
probes were labeled at the 5' end using polynucleotide
kinase and [γ-32P]ATP [12].
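
As a quick consistency check on the probe design, the two oligonucleotides can be scanned for the restriction sites they are said to carry. The sketch below uses the probe sequences as printed and the standard recognition sequences for NdeI (CATATG) and HindIII (AAGCTT). The helper names are hypothetical, and the comparison of probe length against the quoted gene coordinates is an inference rather than something stated in the source.

```python
PROBES = {
    "2492": "ATG CAT ATG CGC AAG ATC GTC GTT GCA GC",
    "2493": "CTT AAG CTT CTA CTG AAC GGC GTC G",
}

# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "HindIII": "AAGCTT"}

# Gene coordinates quoted in the text for the gene-matching part of each probe.
GENE_SPANS = {"2492": (50, 72), "2493": (583, 601)}


def clean(seq):
    """Remove the spacing used for readability and normalize case."""
    return seq.replace(" ", "").upper()


def site_positions(seq):
    """Map enzyme name to the 0-based index of its site, for sites present."""
    s = clean(seq)
    return {name: s.find(site) for name, site in SITES.items() if site in s}


for name, raw in PROBES.items():
    s = clean(raw)
    start, end = GENE_SPANS[name]
    gene_part = end - start + 1
    print(name, "length", len(s), "sites", site_positions(raw),
          "extra 5' nucleotides", len(s) - gene_part)
```

Each probe is found to contain exactly one of the two sites, and in both cases the probe is six nucleotides longer than the quoted gene span, consistent with a short 5' extension carrying the restriction site.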



Blots were prehybridized at 68°C for 4 h with a mixture
of 5× Denhardt's, 5× SSC, 0.5% SDS and 100 µg/ml
denatured salmon sperm DNA. Hybridization was accomplished
using the same treatment with 16 h incubation at
42°C. The blots were washed twice for 30 min at 25°C
with 2× SSC plus 0.1% SDS and similarly with 0.1×
SSC plus 0.1% SDS.

2.4. Cloning of streptavidin genes

DNA samples from S. avidinii or from S. venezuelae
Tü 2460 were digested with BamHI, and the fragments
were separated on 1% low-melting agarose gels. Fragments
2-2.5 kb in length were separated and ligated into
BamHI-digested plasmids (pGEM and pUC18, respectively).
DNA from S. venezuelae Tü 2605 was digested
with PstI and fragments of ~ 5000 bp were excised from
the gels and ligated into pUC18 plasmid. Escherichia coli
XL1-Blue was used as a host. Probes 2492 and 2493 were
used for screening. Prehybridization and hybridization
temperatures were 68°C and 42°C, respectively.

2.5. Sequencing 

Nucleotide sequences were determined by the 
dideoxynucleotide chain-termination method [13] using Se- 
quenase (United States Biochemical) according to the pro- 
tocol supplied by the manufacturer. To complete the se- 
quences, various synthetic oligonucleotide sequencing 
primers were used to cover both strands. 

2.6. Detection of biotin-binding activity 

The presence of biotin-binding proteins in the culture
medium of the various strains of Streptomyces was determined
colorimetrically on microtiter plates as described
previously [14].

2.7. Preparation of streptavidin and analogs 

Cells of S. avidinii, S. venezuelae Tü 2460 or 2605
were grown on malt medium, and the biotin-binding proteins
in the culture broth were purified using an iminobiotin-Sepharose
resin [15]. The purified proteins were dialyzed
against distilled water and lyophilized. The biotin-binding
activity of a weighed sample of each protein was
determined by the HABA method [16].



Table 1

Isolation of biotin-binding proteins from Streptomyces strains

Strain                    Protein            Yield (mg/l)
S. avidinii               streptavidin       12.5
S. venezuelae Tü 2460     streptavidin v1    24.0
S. venezuelae Tü 2605     streptavidin v2    22.0

[Figure 1: bar chart. Horizontal axis: biotin-binding activity (A410), 0.0 to 0.4. Strains shown: S. lividans, S. lactamdurans, S. venezuelae Tü 2460, S. venezuelae Tü 2605, S. venezuelae ATCC 10712, S. lavendulae, S. jumonjinensis, S. coelicolor, S. clavuligerus, S. griseus, S. cattleya, S. aleph 50.]

Fig. 1. Secretion of biotin-binding activity in various species of Streptomyces. The designated strains were grown on malt medium and the cell-free growth
medium was assayed for biotin-binding activity by an ELISA-like colorimetric assay system as described in the text.



2.8. Miscellaneous methods 

SDS-PAGE of the streptavidins was performed on boiled 
samples using 15% gels according to Bayer et al. [15]. The 
gels were stained with Coomassie brilliant blue R-250. 

The concentration of the purified protein in solution
was estimated spectrophotometrically (ε = 3.0).







3. Results

3.1. Presence of biotin-binding proteins in Streptomyces 
cultures 

Various strains of Streptomyces were grown under
identical conditions. Following a 7-day growth period, the
cell-free culture medium was examined for biotin-binding
activity [17]. The results (Fig. 1) indicated that only S.
avidinii and S. venezuelae Tü 2460 and 2605 exhibited such
activity. In contrast, the type strain (ATCC 10712) of S.
venezuelae did not contain such activity. Nor did the type
strain (ATCC 8664) of S. lavendulae. Thus, only those
cultures that were previously demonstrated to produce
Acm-containing peptides showed biotin-binding activity.

Fig. 2. Comparative SDS-PAGE profiles of the streptavidin preparations
from S. avidinii and S. venezuelae. Cultures of the desired strains were
grown, the respective biotin-binding proteins were purified by affinity
chromatography on iminobiotin columns and subjected to SDS-PAGE.
Sa, streptavidin from S. avidinii; Sa v1, streptavidin v1 from S. venezuelae
strain Tü 2460; Sa v2, streptavidin v2 from S. venezuelae strain Tü
2605; nSa, standard for native (full complement) streptavidin; cSa, standard
for core (proteolysed) streptavidin. Note: in these particular batches,
all of the proteins were degraded to some degree by resident proteinases
in the culture medium. Compare with the native and core streptavidin
standards (Mr 16600 vs. 13200, respectively). The experimental sample
of streptavidin showed 4 bands; the smallest being slightly larger than
core streptavidin. The native bands for streptavidins v1 and v2 both
migrated at very slight, but consistently different rates than that of
streptavidin, suggesting slight differences in their primary structures.

Fig. 3. Southern hybridization of genomic DNA from various species of
Streptomyces. (Top) Agarose gel electrophoresis of BamHI restriction
digests of the indicated genomic DNAs. (Bottom) Southern hybridization
using oligonucleotide probe based on the streptavidin gene of S. avidinii.
Lanes: M, DNA markers; S.a, S. avidinii; 2, S. lividans; 3, S. lactamdurans;
S.v 2605, S. venezuelae Tü 2605; S.v 2460, S. venezuelae Tü
2460; 7, S. lavendulae; 8, S. jumonjinensis; 9, S. coelicolor; 11, S.
griseus; 12, S. cattleya; 13, Streptomyces aleph 50.

The extent of biotin-binding activity in the three antibi- 
otic-producing strains seemed quite similar, and we were 
interested in determining whether this activity reflected the 
presence of streptavidin-like biotin-binding proteins. We 
therefore subjected the cell-free growth medium from each 
strain to an affinity chromatographic procedure, designed 
to isolate streptavidin [18]. The purified proteins were 
examined for biotin-binding activity. 

Following purification, the relative amounts of protein
obtained from the cultures of both S. venezuelae strains
were found to be similar to that of S. avidinii (Table 1). In
each case, SDS-PAGE analysis (Fig. 2) of the purified
proteins revealed a high-molecular-weight band (ca. 17
kDa) for the monomer, which is considered to correspond
to a native form of each protein as demonstrated previously
for streptavidin [15]. In addition, the presence of one
or more lower molecular-weight bands could be discerned,
which reflect proteolytic breakdown products [19]. The
mobilities of the bands in the S. venezuelae forms also
differed slightly from that of S. avidinii.

In the case of the reference streptavidin sample prepared
in this experiment, there appeared to be a higher
level of endogenous proteinases in the culture, which led
to a more extensive degradation of the terminal appendages in the
molecule. As documented earlier [15], the resultant
low-molecular-weight form is larger than core streptavidin by 4
amino acid residues, and this difference is detectable on
SDS-PAGE gels.

3.2. Streptavidin-like genes in S. venezuelae Tü 2460 and
2605

The observed differences in the mobility profiles of the
streptavidin-like proteins in the three strains suggested that
the respective genes may also differ. We therefore synthesized
two probes, based on the known sequence of the
streptavidin gene. Genomic DNA from each of the strains
was prepared and digested with a series of restriction
enzymes. The Southern hybridization pattern of the BamHI
digests is shown in Fig. 3.

As can be seen from the figure, distinctive labeling was
obtained for the digest of S. avidinii and the two S.
venezuelae Tü strains. Thus, in accordance with the results
observed for biotin-binding activity, the only strains which
were labeled by the gene probes were the three strains
which are known to express the Acm-based antibiotic.
None of the other strains, including the S. venezuelae type
strain (ATCC 10712), interacted with the probes.

Interestingly, the labeled band derived from S. venezuelae
Tü 2460 was much closer in size to that of S. avidinii
than to the labeled band from strain Tü 2605. This was
true for all of the digests prepared using other restriction
enzymes, e.g., KpnI, SmaI and RsaI as well (data not
shown). It seems that the vicinity of the gene from S.
venezuelae strain Tü 2460 more closely resembles that of
S. avidinii than does that of strain Tü 2605. This observation
was confirmed by cloning and sequencing the two genes
(Fig. 4). Indeed, the sequences adjacent to the streptavidin
v1 gene (from S. venezuelae strain Tü 2460) were nearly
identical to those of the original streptavidin gene. On the
other hand, the sequences immediately upstream and
downstream to the streptavidin v2 gene (from strain Tü
2605) showed many more nucleotide substitutions.

Fig. 4. Sequence comparison of the streptavidin genes from S. avidinii and S. venezuelae. Hyphens represent nucleotides in the respective genes for
streptavidin v1 and v2 (Sa v1 and Sa v2) which are identical to those of the original streptavidin gene (Sa). Nucleotides that differ at a given position are
indicated. Start and stop codons are marked by asterisks (***).

Fig. 5. Comparison of the deduced amino acid sequences of streptavidin
v1 and v2 (Sa v1 and Sa v2) with the known primary structure of
streptavidin (Sa). Amino acids known to contribute to the biotin-binding
site in streptavidin are designated by boxes. The positions of the β
structures (numbered) are shown by arrows. Residues which comprise the
signal peptide are assigned negative numbers. Vertical arrows denote the
N- and C-terminal proteolytic sites which delimit the stable streptavidin
'core'.

These comparative differences were mirrored within the
corresponding genes themselves, i.e., the streptavidin v1
gene and the original streptavidin gene were much more
similar to each other than either was to streptavidin v2.
Thus, streptavidins v1 and v2 showed 6 and 42 nucleotide
substitutions, respectively, compared to the original
streptavidin gene.
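
The substitution counts quoted above come from position-by-position comparison of aligned gene sequences. A minimal sketch of that counting step follows, assuming the sequences have already been aligned to equal length; the three short sequences are made-up placeholders, not the streptavidin genes, and the function names are illustrative only.

```python
def count_substitutions(ref: str, other: str) -> int:
    """Count positions at which two pre-aligned, equal-length sequences differ."""
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(ref.upper(), other.upper()) if a != b)


def percent_identity(ref: str, other: str) -> float:
    """Percentage of aligned positions that are identical."""
    return 100.0 * (len(ref) - count_substitutions(ref, other)) / len(ref)


# Hypothetical toy sequences, used only to exercise the functions.
sa = "ATGCGCAAGATCGTC"
sa_v1 = "ATGCGCAAGATCGTT"  # differs from sa at one position
sa_v2 = "ATGAGGAAGCTCGTT"  # differs from sa at four positions

for name, seq in (("v1", sa_v1), ("v2", sa_v2)):
    print(name, count_substitutions(sa, seq), round(percent_identity(sa, seq), 1))
```

Applied to the real alignment of Fig. 4, the same comparison would be expected to return the 6 and 42 substitutions reported in the text; insertions or deletions, if any, would need a gap-aware alignment first.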

The deduced amino acid sequences of the streptavidins
are presented in Fig. 5. Compared to the primary structure
of streptavidin, streptavidin v1 showed only a single amino
acid substitution at position 100: a threonine instead of
alanine. This residue is located within a loop which
connects β-strands 6 and 7 of the β-barrel [20]. The residue is
in an exposed position and is important neither to the
structure of the protein nor to its biotin-binding function.

In streptavidin v2, 9 amino acid substitutions were
apparent. One was in the signal peptide. Three more were
in the extraneous N-terminal segment which is frequently
cleaved proteolytically during growth of the bacterium and
postsecretory processing of the molecule. Consequently,
within the reputed streptavidin v2 core protein (residues 14
to 139), only 5 amino acid substitutions were evident,
compared to the known streptavidin core sequence. As in
streptavidin v1, one of the substitutions occurred at
position 100. Likewise, the other 4 substitutions are relatively
conservative ones which occur in relatively unimportant
residues located in exposed loops which interconnect
β-strands. The residues of the β-strands per se are unaltered.
All of the amino acid residues known to participate in the
binding of biotin are likewise conserved. The only
'exciting' substitution is the replacement of glutamic acid
by alanine at position 116, which would presumably result in a
protein with a higher pI.



4. Discussion 

In a sense, this work is a retrospective one. Three
decades ago [7], a new synergistic antibiotic activity was
described in a new species of Streptomyces, termed S.
avidinii, which led to the discovery of the biotin-binding
protein streptavidin [21]. Thus, streptavidin combines
synergistically with the 'stravidins' [22], namely, a group of
di- or tripeptides which contain an unusual amino acid,
called amiclenomycin (Acm), which acts as a biotin
antimetabolite [23,24].

Since these initial works, interest in this protein has 
assumed a practical nature, and the distinction of strepta- 
vidin has shifted to that of a preferred replacement for 
egg-white avidin in avidin-biotin technologies [25-27]. 

All but lost in the interim years was the fact that in the 
original work, similar antibiotic activity was also described 
in 11 additional strains [6]. All of these strains could be
classified in a single, well-known species - S. lavendulae. 
The original authors, in their zeal to analyze the antibiotic 
activity from the newly classified S. avidinii, disregarded 
further analysis of the activities in the S. lavendulae 
strains. It is unfortunate that the latter strains are now 
unavailable. The question still stands as to the nature of 
putative streptavidin-like proteins in these strains. 

For this reason, the more recent discovery [8] of the 
Acm antibiotic in two strains of S. venezuelae was intrigu- 
ing to us. In particular, we were interested in re-addressing 
the question of whether new biotin-binding proteins are 
produced in species that express this type of antibiotic. If 
so, we wanted to isolate and characterize these proteins 
and their genes. 

Our findings indeed indicate that the occurrence of
streptavidin-like proteins, albeit unusual, may be more
commonplace in Streptomyces than hitherto believed.
Interestingly, the ATCC-derived 'type strains' of both S.
lavendulae and S. venezuelae do not produce such
biotin-binding proteins and their genomic DNAs fail to hybridize
with streptavidin-specific probes. It thus appears that the
production of streptavidin-like proteins is not
species-specific. Instead, the harboring of a streptavidin-like gene
in an individual strain of Streptomyces and the expression
of its product appear to be directly connected to its
production of the synergistic Acm-based antibiotic activity.

Both traditional taxonomic methods [28] and ribosomal 
protein analysis [29] have indicated that S. avidinii and S. 
lavendulae are very closely related species. In fact, Ochi 
[30] suspects that the two species might be combined as a 
single taxon. On the other hand, S. venezuelae belongs to a 
different cluster group according to numerical classifica- 
tion [28] and appears to be more diverse than S. avidinii 
and S. lavendulae. 

In this context, it is interesting that one of the strepta- 
vidins from S. venezuelae is more closely related to the 
original streptavidin than to its intraspecies cognate. In any 
case, the observed substitutions are certainly conservative 
ones and have little bearing on the chemical, structural and 
biotin-binding properties of the resultant protein molecule. 

We have recently elucidated the three-dimensional crys- 
tal structure of egg-white avidin [4], and both its fold and 
the binding-site residues which are essential to binding 
biotin were compared with those of streptavidin [20,31]. 
When we initiated the present studies, we had hoped that 
the sequence of such streptavidin-like proteins would pro- 
vide us with insight from nature into the types of muta- 
tions which would be worthwhile to perform on such a 
protein. Instead, nature seems to be telling us that there 
may be very little room for change in the amino acid 
sequences of these proteins in order to bind biotin. 

The overall similarity in the sequences of egg-white 
avidin and core streptavidin is about 35%, but the similar- 
ity in the biotin-binding pocket is approx. 90%, indicating 
the importance of selected residues to the binding of 
biotin. The consequence of the differences may be the 
observed reduction of two orders of magnitude in the 
binding affinity, in favor of the egg-white protein [32]. The 
necessity of certain binding-site residues was also implied 
in our recent studies on antibiotin antibodies [5], in which 
binding-site motifs similar to those of the binding sites of 
avidin and streptavidin were found, although the binding 
affinity of the antibody for biotin is significantly lower. 
Markedly reduced biotin-binding activities were also
demonstrated recently on peptide segments isolated from
avidin [33] and in a library of peptides synthesized by
bacteriophages [34]. The logical conclusion from these
studies is that mutations in the binding site residues of
avidin or streptavidin will lead to a weaker binding of
biotin. These studies also suggest that the strong binding to
biotin may play a major biological function, although its
exact role in nature and its implications to the antibiotic
activity (other than complete inactivation thereof) remain
a mystery.
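
To put the "two orders of magnitude" difference in binding affinity into energetic terms, the ratio of association constants can be converted to a free-energy difference with ΔΔG = RT ln(ratio). The sketch below is only an illustration: the temperature of 298.15 K is an assumption that the text does not state, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_affinity_ratio(ratio: float, temp_k: float = 298.15) -> float:
    """Free-energy difference (kcal/mol) corresponding to a ratio of association constants."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return R_KCAL * temp_k * math.log(ratio)


# A 100-fold (two orders of magnitude) difference in affinity at the assumed 25 degrees C.
ddg = ddg_from_affinity_ratio(100.0)
print(f"{ddg:.2f} kcal/mol")
```

Under these assumptions a 100-fold affinity difference corresponds to roughly 2.7 kcal/mol, a small fraction of the total binding free energy of either protein for biotin.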

In conclusion, the streptavidins from S. avidinii and S. 
venezuelae are very similar molecules. They bind biotin 
similarly and to the same extent, and are expected to 



display the same type of three-dimensional fold. Since 
streptavidins are produced by several related species of 
Streptomyces, the original classification of S. avidinii as a 
separate species should be reevaluated. The discovery, 
cloning and sequencing of additional streptavidin-like genes 
together with site-directed mutagenesis studies should shed 
further light on the status of such biotin-binding proteins. 
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Biotinidase (EC 3.5.1.12) catalyzes the hydrolysis of 
biocytin, the product of biotin-dependent carboxylase
degradation, to biotin and lysine. Biotinidase deficiency
is an inherited metabolic disorder of biotin recycling
that is characterized by neurological and cutaneous
abnormalities, and can be successfully treated with biotin
supplementation. Sequences of tryptic peptides of the
purified human serum enzyme were used to design
oligonucleotide primers for polymerase chain reaction
amplification from human hepatic total RNA to generate
putative biotinidase cDNA fragments. Sequence analy- 
sis of a cDNA isolated from a human liver library by 
plaque hybridization with the largest cDNA probe re- 
vealed an open reading frame of 1629 bases encoding a 
protein of 543 amino acid residues, including 41 amino 
acids of a potential signal peptide. Comparison of the 
open reading frame with the known biotinidase tryptic 
peptides and recognition of the expressed protein en- 
coded by this cDNA by monoclonal antibodies prepared 
against purified biotinidase demonstrated the identity 
of this cDNA. Southern analyses suggested that biotini- 
dase is a single copy gene and revealed that human 
cDNA probes hybridized to genomic DNA from mam- 
mals, but not from chicken or yeast. Northern analysis 
indicated the presence of biotinidase mRNA in human 
heart, brain, placenta, liver, lung, skeletal muscle, kid- 
ney, and pancreas. 
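
The sizes quoted in the abstract can be cross-checked with simple codon arithmetic. The sketch below assumes that the 1629-base open reading frame is counted without the termination codon, which is the reading under which it yields exactly 543 residues; the variable and function names are illustrative.

```python
ORF_BASES = 1629              # open reading frame length stated in the abstract
SIGNAL_PEPTIDE_RESIDUES = 41  # potential signal peptide stated in the abstract


def residues_encoded(orf_bases: int) -> int:
    """Number of amino acids encoded by an ORF given in bases (stop codon not counted)."""
    if orf_bases % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_bases // 3


precursor = residues_encoded(ORF_BASES)       # full-length translation product
mature = precursor - SIGNAL_PEPTIDE_RESIDUES  # after removal of the signal peptide
print(precursor, mature)
```

The result, 543 residues for the precursor and 502 after removal of a 41-residue signal peptide, agrees with the mature protein length given later in the Results.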



Biotinidase (EC 3.5.1.12) catalyzes the release of biotin, an 
essential B-complex vitamin, from biocytin, the degradative 
product of the four biotin-dependent holocarboxylases,
pyruvate carboxylase, acetyl-CoA carboxylase, propionyl-CoA
carboxylase and β-methylcrotonyl-CoA carboxylase, and from
proteolytically degraded dietary proteins (1). Mammals cannot
synthesize biotin and, therefore, must obtain the vitamin from 
their diet and from recycling endogenous biotin. 

Biotinidase deficiency, an autosomal recessive disorder,
results in a secondary biotin deficiency that leads to multiple
carboxylase deficiency (2). Clinical features of untreated
individuals with profound biotinidase deficiency (<10% of mean
normal activity) include seizures, hypotonia, skin rash,
alopecia, developmental delay, conjunctivitis, visual problems,
hearing loss, metabolic ketolactic acidosis, organic acidemia, and
hyperammonemia (1). There is variability in the expression of
these features and in the age of onset of symptoms, even within
the same family (3). Because biotin therapy, when initiated
early, prevents many clinical and biochemical symptoms of this
disorder (1), newborn screening for biotinidase deficiency has
been implemented in many states in the United States and in
many countries (4).

[...]tional Institutes of Health (to B. W.). The costs of publication of this
article were defrayed in part by the payment of page charges. This [...]
[...]mond, VA 23298. Tel.: 804-786-9632; Fax: 804-786-3760.

We now describe the cloning and sequencing of the cDNA 
encoding normal human biotinidase. The tissue distribution of 
the biotinidase mRNA, copy number, and preliminary analysis 
of conservation of the genomic gene for biotinidase were deter- 
mined. 

EXPERIMENTAL PROCEDURES 

General Methods — Standard methods were performed (5), except
where noted. Restriction endonucleases and DNA modifying enzymes
were obtained from Life Technologies, Inc. Membranes were
prehybridized in 5 x SSC, 5 x Denhardt's reagent (0.1% Ficoll, 0.1%
polyvinylpyrrolidone, 0.1% bovine serum albumin), 2% SDS, and 100 µg/ml
sonicated salmon sperm DNA at 60 °C for 4 h. Hybridizations were
performed in fresh prehybridization solution with 10% dextran sulfate
at 60 °C for 14-18 h. The final stringency of washing conditions was 0.2
x SSC, 1% SDS at 60 °C for 30 min, except where noted. Filters were
dried, covered in plastic wrap and exposed to X-Omat AR scientific
imaging film (Kodak) with Cronex intensifying screens (DuPont) at
-70 °C for 24 h, unless otherwise noted. Oligonucleotides used in
polymerase chain reactions (PCR)¹ and sequencing procedures were
synthesized by the Louisiana State University Core Facility (New Orleans,
LA) and the Nucleic Acid Synthesis and Analysis Laboratory (Medical
College of Virginia/Virginia Commonwealth University (MCV/VCU),
Richmond, VA). Radiolabeled deoxynucleotides were obtained from
DuPont NEN (Boston, MA).
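
The SSC strengths used for prehybridization (5 x) and for the final wash (0.2 x) translate directly into salt concentrations. The sketch below uses the standard composition of 1 x SSC (0.15 M NaCl, 0.015 M trisodium citrate); that composition is textbook knowledge rather than something stated in this paper, and the helper names are illustrative.

```python
# Standard 1 x SSC: 0.15 M NaCl and 0.015 M trisodium citrate.
SSC_1X = {"NaCl_M": 0.15, "Na3_citrate_M": 0.015}


def ssc(fold: float) -> dict:
    """Component molarities of an n-fold SSC solution."""
    return {name: conc * fold for name, conc in SSC_1X.items()}


def sodium_molar(fold: float) -> float:
    """Total Na+ molarity: one Na+ per NaCl plus three per trisodium citrate."""
    c = ssc(fold)
    return c["NaCl_M"] + 3 * c["Na3_citrate_M"]


for fold in (5, 0.2):
    print(f"{fold} x SSC: {sodium_molar(fold):.3f} M Na+")
```

The roughly 25-fold drop in sodium concentration between the hybridization and the final wash is what makes the wash the more stringent step at the same temperature.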

cDNA Probes — BTD400 was liberated from pCR1000 upon digestion
with EcoRI and HindIII. BTD2000 was excised from pBluescript SK
with BamHI and ApaI. Inserts were isolated from a 1.3% low melting
point agarose gel, purified by extraction with phenol:chloroform and
recovered by ethanol precipitation. The β-actin cDNA was obtained
from Clontech (Palo Alto, CA). cDNAs were radiolabeled with
[α-32P]dCTP (3000 Ci/mmol) (DuPont NEN) using an oligolabeling kit
(Pharmacia LKB Biotechnology Inc.).

Preparation of Tryptic Peptides of Biotinidase — Enzymatically active
biotinidase was purified 22,000-fold from pooled normal human serum
(6). A single amino acid N terminus was found on analysis of the protein.
A single silver-staining protein was observed on SDS and native
polyacrylamide gel electrophoresis (PAGE). Polyclonal and monoclonal
antibodies prepared against the purified serum enzyme detected a single,
approximately 74-kDa protein in serum on immunoblots of SDS-PAGE
(7). Using these antibodies, we have identified several individuals with
profound biotinidase deficiency who lack cross-reacting material to
antibodies prepared against biotinidase in their serum on SDS-PAGE and
isoelectric focusing electrophoresis (7). Purified biotinidase was treated
with trypsin, the resulting peptides were separated by high performance
liquid chromatography, and 11 were sequenced by Edman degradation
(8) at the Yale University Protein and Nucleic Acid Chemistry
Facility (W. M. Keck Foundation Resource Laboratory, New Haven, CT).

¹ The abbreviations used are: PCR, polymerase chain reaction; PAGE,
polyacrylamide gel electrophoresis; bp, base pair(s); kb, kilobase(s);
Tricine, N-tris(hydroxymethyl)methylglycine.

PCR Conditions and Generation of cDNA Probes — Three sets of
oligonucleotide primers were derived from the amino acid sequence of
several tryptic peptides of the biotinidase protein (Fig. 1) and from
gene-specific DNA sequence. The primers in set 1 are degenerate (1A,
sense, 32-fold degeneracy; 1B, antisense, 96-fold degeneracy), whereas
those in the second set (2A, sense; 2B, antisense) were generated
according to codon usage frequencies (9). The third set consists of a
gene-specific oligonucleotide (3B, antisense,
5'-GCCCCTGATGGCCATACAACT-3') synthesized from the reverse complement
of the DNA sequence of BTD300 and a degenerate primer pool designed from the
N-terminal peptide in which inosine was substituted in positions where
degeneracy indicated either an adenosine or guanine (3A, sense,
8192-fold degeneracy).
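
Two routine calculations underlie this primer design: the degeneracy of a primer pool back-translated from a peptide, and the reverse complement used to derive an antisense primer such as 3B. The sketch below illustrates both under simplifying assumptions: degeneracy is taken as the product of the codon counts over whole codons in the standard genetic code, and the peptide "MNQK" is a hypothetical example, not one of the biotinidase peptides. Only the 3B sequence is taken from the text.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def primer_degeneracy(peptide: str) -> int:
    """Number of distinct sequences needed to cover every codon choice for a peptide."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


print(primer_degeneracy("MNQK"))  # hypothetical peptide: 1 * 2 * 2 * 2 = 8

primer_3b = "GCCCCTGATGGCCATACAACT"  # antisense primer 3B as printed in the text
print(reverse_complement(primer_3b))  # the corresponding sense-strand sequence
```

Real primer pools are often shorter than whole codons at the 3' end, and substituting inosine at purine-ambiguous positions (as described for primer 3A) lowers the effective degeneracy, so the published fold values need not equal the whole-codon product.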

RNA was isolated from frozen human liver by the guanidinium
thiocyanate-cesium chloride density gradient centrifugation method (5).
Reverse transcription was performed in a final volume of 50 µl with 10
µg of total RNA, 100 ng of primer 1B, 60 mM Tris-HCl (pH 8.3), 150 mM
KCl, 10 mM MgCl2, 0.5 mM of each dNTP, 0.5 µg of actinomycin D, 10 mM
dithiothreitol, 20 mM β-mercaptoethanol, and 15 units of avian
myeloblastosis virus reverse transcriptase. Following incubation at 42 °C for
00 min, the RNA was precipitated with ethanol, resuspended in 20 µl of
5 x PCR buffer (50 mM Tris-HCl (pH 9.0), 50 mM KCl, 15 mM (NH4)2SO4,
and 7 mM MgCl2). The RNA was used as a template in PCR, performed
in a final volume of 100 µl, with 0.1 µg of primers 1A and 1B, 0.5 mM of
each dNTP and 5 units of Taq polymerase (BioRes). After amplification
in a thermocycler (NOLA Scientific, New Orleans, LA) was performed
for 40 cycles (92 °C for 1.5 min, 55 °C for 3 min, 72 °C for 4 min), the
DNA was precipitated with ethanol and used as a template for an
additional PCR amplification using 0.2 µg of primers 2A and 2B for 20
cycles (92 °C for 1.5 min, 52 °C for 3 min, 72 °C for 2.5 min). The
resulting 300-bp fragment (BTD300) was purified by electroelution
from a 1.5% agarose gel and subcloned into pBluescript KS.
Dideoxynucleotide sequencing (10) was performed with T3 and T7 universal
oligonucleotide primers (Stratagene, La Jolla, CA).

A second reverse transcription reaction was performed with 2 µg of
total RNA, isolated from fresh human liver obtained from the MCV/VCU
Human Tissue Acquisition and Histopathology Facility, and 5 ng
of oligo(dT)15 primer (Promega, Madison, WI) as described (5). One-half
µg of primers 3A and 3B and 2.5 units of AmpliTaq polymerase
(Perkin-Elmer Corp.) were used in PCR for 45 cycles (94 °C for 1 min, 55 °C for
1 min, 72 °C for 3 min) with an initial denaturation step at 94 °C for 5
min and a final extension period of 5 min at 72 °C in a Twin Block
System PCR thermocycler (Ericomp Inc., San Diego, CA). The resulting
400-bp putative biotinidase cDNA fragment (BTD400) was directly
purified by high performance liquid chromatography on an analytical (4 x
250 mm) NucleoPac PA-100 anion exchange column (Dionex,
Sunnyvale, CA) at a flow rate of 1.5 ml/min. The product was fractionated
using a gradient starting at 95% buffer A (20 mM Tris-HCl, pH 7.3) and
5% buffer B (20 mM Tris-HCl, 1.0 M NaCl at pH 7.3) increasing linearly
to 50% buffer B at 10 min and subsequently increasing to 100% buffer
B at 40 min. The product eluted at 23 min and was collected, desalted
using a Centricon-30 (Amicon, Beverly, MA), and lyophilized. BTD400
was cloned into the pCR1000 vector according to the manufacturer's
instructions (TA cloning system, Invitrogen, San Diego, CA). Both
strands of BTD400 were sequenced using Sequenase 2.0 DNA polymerase
(U. S. Biochemical Corp.) and fluorescently labeled terminator
nucleotides (DuPont) according to the manufacturer's instructions (11).
The sequence was analyzed using the GENESIS 2000 system (DuPont).
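
The anion exchange gradient described above is piecewise linear, so the nominal buffer composition at the elution time of BTD400 can be estimated by interpolation. The sketch below assumes the programmed gradient is delivered without delay (no dwell-volume correction) and that the second segment is linear, as the first is stated to be; the function names are illustrative.

```python
# Programmed gradient as (time in min, % buffer B) breakpoints taken from the text.
GRADIENT = [(0.0, 5.0), (10.0, 50.0), (40.0, 100.0)]
BUFFER_B_NACL_M = 1.0  # buffer B contains 1.0 M NaCl; buffer A contains none


def percent_b(t_min: float) -> float:
    """Nominal % buffer B at time t_min, by linear interpolation between breakpoints."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]


def nacl_molar(t_min: float) -> float:
    """Nominal NaCl concentration of the mobile phase at time t_min."""
    return percent_b(t_min) / 100.0 * BUFFER_B_NACL_M


print(f"{percent_b(23):.1f}% B, {nacl_molar(23):.2f} M NaCl at 23 min")
```

Under these assumptions the 400-bp fragment eluted at roughly 72% buffer B, that is, at about 0.7 M NaCl; the actual concentration at the column outlet would lag slightly behind the programmed value.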

Isolation and Sequencing of Biotinidase cDNA Clone — A human liver
cDNA library cloned in the Uni-ZAP XR vector was obtained from
Stratagene. Approximately 5 x 10* recombinant bacteriophage were
screened by plaque hybridization with the BTD400 cDNA probe under
conditions described under "General Methods." Bacteriophage DNAs
containing putative biotinidase inserts were isolated (λ Quick!, BIO 101,
La Jolla, CA). 0.5-1.0 µg of the DNA was digested with HindIII,
separated on a 0.8% agarose gel, and identified by Southern analysis with
the BTD400 cDNA probe, as described previously. Recovery of
phagemids (pBluescript SK) from putative purified bacteriophage
particles was performed in the presence of R408 helper phage according to
the in vivo excision protocol (Stratagene).

Plasmid DNAs containing putative biotinidase cDNA inserts were
purified by cesium chloride density gradient centrifugation (5).
Sequencing reactions were performed with 1 µg of plasmid DNA, 3.2 pmol
of primer, fluorescently labeled terminator nucleotides (DyeDeoxy™)
and AmpliTaq DNA polymerase according to manufacturer's
instructions (Applied Biosystems, Foster City, CA) in a Perkin-Elmer model
9600 thermocycler. DNA sequence was determined for both strands
using a model 373A Automated DNA Sequencer (Applied Biosystems).
Sequence editing was performed using the Genetics Computer Group
Package (UW Biotechnology Center, Madison, WI) (12).

Expression of the Cloned Biotinidase cDNA — In order to perform
expression studies, the Bluescript plasmid containing the biotinidase
cDNA was modified so that the cloned insert was in the same reading
frame as that of β-galactosidase. Plasmid DNA (3.6 µg) was linearized
with BamHI, blunt ends were created by the incorporation of dNTPs (4
mM each) with the Klenow fragment of DNA polymerase, and the
plasmid was religated with T4 bacteriophage DNA ligase. Plasmid DNA,
isolated by cesium chloride density gradient centrifugation, was
sequenced from the M13R primer in the sense orientation with a
dideoxynucleotide sequencing kit (Stratagene) and [α-35S]dATP (600 Ci/mmol).
The modified plasmid and a control Bluescript plasmid (containing no
insert) were each transformed into XL1-Blue competent cells and
induced with 1 mM isopropyl-1-thio-β-D-galactopyranoside for 2.5 h. The
cells were sedimented by centrifugation, washed three times in
phosphate-buffered saline (pH 7.4), and then subjected to a French pressure
cell (SLM Aminco, Urbana, IL). The extract was centrifuged (1000 x g),
and 20 µl of the supernatant were electrophoresed in a 12% SDS-PAGE
Tricine gel (Millipore Corp., Bedford, MA). The gel was then
electroblotted onto nitrocellulose (Bio-Rad) as described (13). The membrane
was pretreated with nonfat dry milk and then incubated with an IgG
preparation of monoclonal antibodies against purified human serum
biotinidase (Hybridoma-Monoclonal Antibody Laboratory, MCV/VCU)



as described by Hart et al. (7). Immunoreactive species were detected
using the enhanced chemiluminescence (ECL) system according to the
manufacturer's instructions (Amersham Corp.).

[Fig. 1. Peptide and oligonucleotide primer sequences; caption and degenerate primer sequence not recoverable from this copy.
Peptide: GlnGluAlaLeuGluLeuMetAsnGlnAsnLeuAspIleTyrGluGlnGlnValMetThrAlaAlaGlnLys
Nucleotide: CAGGAGGCCCTGGAGCTCATGAACCAGAACCTGGACATCTATGAGCAGCAGGTGATGACAGCTGCCCAGAA]

Genomic Southern Analyses — Total genomic DNA was isolated from
lymphocytes of an individual with normal serum biotinidase activity
using a model 340A nucleic acid extractor (Applied Biosystems). Ten µg
of DNA were digested with BamHI, BglI, EcoRI, EcoRV, XbaI, or XhoI.
After electrophoresis through a 0.8% agarose gel, the DNA was
transferred onto a 0.45-µm Nytran nylon membrane (Schleicher & Schuell)
as described by Southern (14). A ZOO-blot containing 8 µg of
EcoRI-digested genomic DNA from nine eukaryotic species was purchased
from Clontech. These membranes were hybridized with radiolabeled
BTD2000 and exposed to autoradiography for 5 and 14 days,
respectively, as described under "General Methods." After the BTD2000 probe
was removed from the ZOO-blot with 0.2 M NaOH at 60 °C for 20 min,
the filter was hybridized with the human BTD400 cDNA probe, washed
at conditions of lower stringency (twice in 2 x SSC at 55 °C for 30 min,
then once in 0.1 x SSC at room temperature for 5 min), and exposed to
x-ray film for 5 days.



RESULTS AND DISCUSSION 
Cloning and Sequence of cDNA Encoding Biotinidase — The
amino acid sequences of 11 tryptic peptides, including the
N-terminal peptide, were determined by Edman degradation. No
match for any of these sequences was found in the Protein
Identification Resource, GenBank, or Swiss Protein data bases.
A reverse transcription polymerase chain reaction was
performed using the degenerate oligonucleotide primers (1A and
1B) derived from the amino acid sequences of peptide 2 and
peptide 8, respectively (Fig. 1), and then using a set of primers
(2A and 2B) generated according to codon usage frequencies.
The resulting 300-bp putative biotinidase fragment (BTD300)
was determined to contain 186 bp that were unique and not
included in the primer sequences. Identity between five
peptides (3, 4, 5, 6, and 7) and the amino acid sequences deduced
from the nucleotide sequence of BTD300 indicated that


[Fig. 2 sequence panel: nucleotide and deduced amino acid sequences of the human biotinidase cDNA; the sequence itself is not recoverable from this copy.]


Fig. 2. Nucleotide and deduced amino acid sequences of human serum biotinidase. Two potential ATG initiation codons are double
underlined (numbered as bases 1 and 61) and could encode signal peptides of 41 and 21 amino acids, respectively. An Ala residue and a Gly residue
are located at amino acid positions -3 and -1, respectively, to the N terminus of the mature protein, which is consistent with a signal peptidase
cleavage site (22). An open reading frame of 1629 bp, relative to the first ATG codon and the termination codon (indicated by ***) was present. The
N terminus of the mature serum protein (starting at base 124) is indicated in uppercase letters. The amino acid sequences of 11 tryptic peptides
from purified human serum biotinidase are underlined. Six potential N-linked glycosylation sites (Asn-X-Thr/Ser) are noted with asterisks above
the sequence. The cDNA clone also includes 35 bp 5' of the first ATG codon and 332 bp of 3'-untranslated sequence, with a polyadenylation signal
(AATAAA) at base 1932 (boldface) located 24 bp upstream from the 20-bp poly(A) tract.

BTD300 represented a region of the biotinidase gene. In order
to obtain a larger portion of the biotinidase cDNA for use as a
probe, an additional reverse transcription polymerase chain
reaction, performed using a gene-specific oligonucleotide (3B)
synthesized from the DNA sequence of BTD300 and a
degenerate primer pool (3A) designed from the N-terminal peptide,
yielded a 400-bp putative biotinidase fragment (BTD400). The
nucleotide sequence of BTD400 [...] that of BTD300 and
showed sequence identity with the amino acid sequences of five
tryptic peptides (2, 3, 4, 5, and [...]), [...]
is a portion of the biotinidase [...].

One-fourth of the [...]
human liver cDNA library was screened by plaque
hybridization with radiolabeled BTD400. [...]
because the liver appears to be the primary source of serum
biotinidase (15-17). Two clones that produced strong
hybridization signals with [...]
screen were isolated after a [...] screen. Hybridization
between the probe and HindIII-digested bacteriophage
DNA isolated from these two clones indicated that the inserts
[...]
largest insert were sequenced with universal primers M13R,
T7, and M13F and [...]
the newly obtained biotinidase sequence.

[...] (20[..] bp) and the deduced amino acid sequence are shown in
Fig. 2. All peptides obtained [...] biotinidase [...]
the cDNA. There were no peptide sequences that were not
contained in this sequence. [...] matches or significant
homologies with either the DNA or amino acid sequences of biotinidase
were found in searches of GenBank, EMBL, [...], PIR-Nucleic,
and PIR-Protein [...]. [...] cDNA encodes for a
mature protein of 502 amino acids with a molecular mass of
56,771 Da, which is similar to the 60 kDa reported for the
glycanase-treated serum enzyme (13). The amino acid
composition of the encoded mature enzyme also compares favorably
with previously published analyses (18, [...]) [...]
published data. Glycosylation of the protein [...] to
increase in molecular mass of 14-23 kDa, given that the
[...] range [...]
kDa for sialated biantennary to 3.8 for sialated
tetraantennary [...] (20), and assuming
that [...]
glycosylated. The molecular mass of the glycosylated enzyme is
estimated [...]
with that of the glycosylated serum enzyme reported by our
laboratory and others (6, 18, 19, 21).

Expression analysis of the cDNA insert was performed to
demonstrate that the cloned [...]
cDNA insert was modified into the same reading frame as
[...]
Sequencing of the [...] that the BamHI site was
absent and that the cloned insert was in the same reading
frame as β-galactosidase. Expression of this cloned cDNA
yielded a protein with an estimated molecular mass of 53 kDa
that reacted immunologically [...],
[...] that this cDNA encodes biotinidase.

Expression, Copy [...]
[...]dase Gene — Northern analysis [...]
cDNA probe revealed a 2.0-kb hybridization signal in multiple
human tissues (Fig. 3). Hybridization of the BTD2000 probe
with genomic DNA digested with BamHI, EcoRI, EcoRV, and
XhoI produced a single hybridization signal at 7.5, 11.6, 13.2,
and greater than 24.0 kb, respectively (results not shown).
Because these enzymes do not cleave the biotinidase cDNA,
these results suggest that biotinidase is a single copy gene with



Fig. 3. Northern analysis of poly(A)+ RNA from multiple human
tissues. A Northern blot of poly(A)+ RNA (2 µg) isolated from
multiple human tissues was purchased from Clontech [...]
mRNA isolated from heart, brain, placenta, lung, liver, skeletal muscle,
kidney, and pancreas. [...] biotinidase
activities [...] mammalian [...]
kidney, pancreas, and heart [...], but little or no biotinidase activity
[...] mammalian brain [...]
[...] the blot was exposed to autoradiography for
[...] ensure [...]
mRNAs in all lanes was demonstrated by the presence of the expected
2.0-kb hybridization [...] of
[...] exposure to x-ray film for 14 h. The additional [...]-kb signal present in the
lanes containing the heart and skeletal muscle mRNAs is due to an
[...]



rip indication of pseudogeries. 

Hybridization was observed between human cDNA probes,
BTD400 and BTD2000, and the digested DNA from human,
monkey [...]
and cow (data not shown). The hybridization pattern of the
BTD400 cDNA probe [...] mammalian DNAs was identical
to that observed between the BTD2000 cDNA probe [...]
rabbit, but not between the BTD2000 probe and
[...] confirmed
in the experiment with [...]. The
human cDNA probes did not hybridize [...] chicken or
yeast (Saccharomyces cerevisiae) under the experimental
conditions used. [...] the ability of biotinidase
[...] an analogue
of biocytin, biotinidase activity has been detected in
human, rat, mouse, [...] (15). Failure
of the human probes to hybridize [...]
analysis may indicate [...] is insufficient homology between
the human cDNA probes and [...] sequences
to allow hybridization or that the conditions used in
the analysis were too stringent for hybridization to occur.

[...]
and should provide insight into the genotype-phenotype
relationships in biotinidase deficiency and a better understanding
of the clinical expression and variability of the disorder. In
addition, identification [...]
provide insight about the protein domains
that are important for enzymatic action [...]
biotinidase in normal metabolism and nutrition.

Acknowledgments: We [...] Drs. Joyce Lloyd and Eric Westin for
helpful advice during [...]
Mature dry pea seeds contain three major biotinylated proteins.
Two of these, of subunit molecular mass about 75 kDa and
200 kDa, are associated with 3-methylcrotonyl-CoA carboxylase
(EC 6.4.1.4) and acetyl-CoA carboxylase activities (EC 6.4.1.2)
respectively. The third does not exhibit any of the biotin-dependent
carboxylase activities found in higher organisms and
represents the major part of the total protein-bound biotin in the
seeds. This novel protein has been purified from a whole pea seed
extract. Because in SDS/polyacrylamide gels the protein migrates
with an apparent molecular mass of about 65 kDa, it is referred
to as SBP65, for 65 kDa seed biotinylated protein. The molecular
mass of native SBP65 is greater than 400 kDa, suggesting that
the native protein assumes a polymeric structure, resulting from
the association of six to eight identical subunits. The results of
CNBr cleavage experiments suggest that biotin is covalently
bound to the protein. The stoichiometry is 1 mol of biotin per
1 mol of 65 kDa polypeptide. The temporal and spatial pattern of
expression of SBP65 is described. SBP65 is specifically expressed
in the seeds, being absent from leaf, root, stem, pod and flower 
tissues of pea plants. The level of SBP65 increases dramatically 
during seed development. The protein is not detectable in very 
young seeds. Its accumulation pattern parallels that for storage 
proteins, being maximally expressed in the mature dry seeds. 
SBP65 disappears at a very high rate during seed germination. 
The level of free biotin has also been evaluated for various organs 
of pea plants. In all proliferating tissues examined (young 
developing seeds, leaf, root, stem, pod and flower tissues), free 
biotin is in excess of protein-bound biotin. Only in the mature 
dry seeds is protein-bound biotin (i.e. that bound to SBP65) in 
excess of free biotin. These temporal expression patterns, and the 
strict organ specificity for expression of SBP65, are discussed 
with regard to the possibility that in plants, as in mammals, 
biotin plays a specialized role in cell growth and differentiation. 



INTRODUCTION 

In all organisms biotin is an essential cofactor for a small number
of enzymes involved in CO₂ transfer during carboxylation
reactions [1]. Although the existence of this vitamin is well
recognized in plants, little is known about its biosynthesis and
function. It is known that animal cells contain four biotinylated
enzymes, acetyl-CoA carboxylase (ACC; EC 6.4.1.2),
3-methylcrotonyl-CoA carboxylase (MCC; EC 6.4.1.4), propionyl-CoA
carboxylase (PCC; EC 6.4.1.3) and pyruvate carboxylase (PC;
EC 6.4.1.1). They play central roles in a variety of metabolic and
catabolic processes, supporting essential cellular housekeeping
functions. ACC, which catalyses the ATP-dependent carboxylation
of acetyl-CoA, is recognized as the regulatory enzyme of
lipogenesis. MCC catalyses the conversion of 3-methylcrotonyl-CoA
into 3-methylglutaconyl-CoA, a key reaction in the degradation
pathway of leucine. PCC is a key enzyme in the catabolic
pathway of odd-chain fatty acids, isoleucine, threonine,
methionine and valine. PC has an anaplerotic role in the
formation of oxaloacetate [2].

It is now established that plant cells contain the four biotinylated
enzymes found in animals [3]. The most extensively
studied biotin carboxylase is ACC, because of its obvious function 
in membrane biogenesis, and because the enzyme isolated from 
vegetative tissues and developing seeds of monocotyledonous
plants is the target of the very potent cyclohexanedione and
aryloxyphenoxypropionate herbicides [4-7]. There is evidence 
suggesting that biotin and biotin-containing proteins might play 
specialized roles in regulation of plant development. Thus ACC 
is required for both growth of vegetative tissues and synthesis of 
storage lipids in developing seeds [8,9]. MCC was found to 
increase rapidly during pea leaf development [10]. Also, during 
development of carrot somatic embryos, a 50-fold increase in the 
level of a 62 kDa biotinylated polypeptide has been observed, as 
embryogenic cell clusters developed into torpedo embryos [11].
Attention has been focused recently on the existence of free
biotin in plant cells. A study conducted with pea leaves showed
the existence of a free biotin pool in the cytosolic compartment, 
accounting for about 90 % of the total (free plus protein-bound) 
biotin. It has been argued that this pool might control the 
expression of genes encoding biotinylated carboxylases and/or 
enzymes involved in biotin synthesis [12]. The role of biotin in 
plants is also best illustrated by the discovery of a mutation that 
causes defective embryo development in Arabidopsis thaliana and
requires the vitamin at a critical stage of embryogenesis. Thus no 
biotin was detectable in the arrested embryos of the mutant, and 
mutant embryos were specifically rescued when grown in the 
presence of biotin [13,14]. 

Abbreviations used: ACC, acetyl-CoA carboxylase (EC 6.4.1.2); DTT, dithiothreitol; MCC, 3-methylcrotonyl-CoA carboxylase (EC 6.4.1.4); PBST, PBS
containing 0.1% (v/v) Tween 20; PC, pyruvate carboxylase (EC 6.4.1.1); PCC, propionyl-CoA carboxylase (EC 6.4.1.3); PEG, poly(ethylene glycol);
SBP65, the major biotinylated protein in mature dry pea seeds.
To whom correspondence should be addressed.

As a means of understanding the function of this vitamin in
developing and germinating seeds, we have analysed the content
of free and protein-bound biotin of pea seeds at various
developmental stages. A detailed comparative study was also 
carried out with various organs of pea plants (leaves, roots, 
stems, flowers and pods). The general finding is that in mature 
dry seeds the level of bound biotin is higher than that of free 
biotin, owing to the existence of a seed-specific biotinylated 
65 kDa polypeptide which we called SBP65. This protein does 
not support in vitro any of the four biotin-dependent carboxylase 
activities present in higher organisms. This contrasts with the 
situation observed in all developing organs, including young 
developing seeds and pea plants, where the free vitamin is present 
in higher amounts than that bound to proteins. The possible 
implications of these findings are discussed. 



MATERIALS AND METHODS 
Reagents 

ATP, acetyl-CoA, 3-methylcrotonyl-CoA, D-biotin, biotin-labelled
β-galactosidase, biotin-labelled molecular-mass markers,
anti-biotin peroxidase-conjugated antibodies, horseradish
peroxidase-conjugated streptavidin,
2,2'-azinobis(3-ethylbenzothiazoline-6-sulphonic acid) and
4-chloro-1-naphthol were from
Sigma. NaH¹⁴CO₃ (53.1 mCi/mmol) and D-[carbonyl-¹⁴C]biotin
(56.6 mCi/mmol) were from Amersham. All other chemicals
were of analytical grade. Solutions were made in ultrapure water
(Millipore) and filtered with 0.2 µm filters.



Plant material and samples for developmental studies 

Mature dry pea seeds (Pisum sativum, cv. Douce Provence) used
in this study are referred to as S0 seeds. They had a mean fresh
weight of 250 mg and a diameter of about 6-7 mm. Plants were
grown from these seeds in soil under a 12 h photoperiod of white
light from fluorescent tubes (10-40 µE·m⁻²·s⁻¹) at 20 °C. They
were watered every day with tap water. Different organs (leaves,
roots, stems, flowers, pods and seeds) were harvested at different
times, and stored at -75 °C until use. No germination was
visible at day 1 after planting. The mean fresh weight of the 1- 
day-old seeds was 475 mg. Germination occurred between 36 h 
(0% of seed germination) and 48 h (approx. 90% of seed 
germination) after sowing. The mean fresh weight of the 2-day- 
old seeds was 528 mg. At day 2 and day 3, mean fresh weights of 
the seedlings (minus cotyledons) were 62 and 120 mg respectively. 
Flowering started at day 25. During the initial period of seed 
formation seed samples showed heterogeneity in size. S36(1),
S36(2) and S36(3) refer to seeds collected from pods at day 36.
Mean fresh weights per seed were 7, 32 and 144 mg respectively.
Sample sizes were approx. 1.5, 2.5 and 3.5 mm diameter,
respectively. S41, S52 and S76 refer to seeds collected at days 41,
52 and 76. Mean fresh weights per seed were 300, 500 and
210 mg respectively. The size of these three samples was approx. 
6-7 mm diameter. At day 76, seeds were fully matured and 
desiccated. 



Preparation of extracts for development studies 

Plant materials (1-5 g) were frozen in liquid nitrogen and finely
ground using a mortar and pestle. The powder was homogenized
in 10 vol. of buffer A [50 mM Hepes, pH 8.0, 10% (v/v) glycerol,
1 mM EDTA, 5 mM dithiothreitol (DTT), 1 mM
phenylmethanesulphonyl fluoride, 1 mM benzamidine/HCl, 5 mM
6-aminohexanoic acid], followed by centrifugation (40000 g,
30 min). The supernatant comprised the crude extract which was
processed as follows. (i) Portions of the extract were used for the
determination of total (free plus protein-bound) biotin. (ii) To
1 ml of crude extract, 4 ml of cold acetone (-20 °C) was added
and the mixture left to stand for 30 min to precipitate proteins.
After centrifugation (3000 g, 10 min), the supernatant was
evaporated to dryness under a stream of nitrogen, and 1 ml of
PBS containing 0.1% (v/v) Tween 20 (PBST) was added. This
solution was used for quantification of free biotin. Controls, in
which D-[¹⁴C]biotin was added to the extracts before the addition
of acetone, established that the radioactivity was quantitatively
recovered from the supernatant fraction. Also, it was verified
that the supernatants were depleted of protein-bound biotin. (iii)
Crystalline (NH₄)₂SO₄ was added to each crude extract with
stirring until 50% (w/v) saturation was achieved. The mixture
was stirred for 30 min at 4 °C, and then centrifuged (40000 g,
20 min). The pellet was resuspended in 2 vol. of buffer A, and the
solution was used for biotin carboxylase activity measurements
and for quantification of protein-bound biotin. For SDS/PAGE
analyses, 1 vol. of SDS sample buffer [10 mM Tris/HCl, pH 6.8,
2.5% (w/v) SDS, 15% (v/v) 2-mercaptoethanol, 30% glycerol,
0.06% (w/v) Bromophenol Blue] was added to 2 vol. of each
sample and heated to 100 °C for 5 min.

Protein determination, activity measurements and electrophoresis 

Protein was quantified by the method of Bradford [15] with BSA
as the standard. Biotin carboxylase activities were measured as
the incorporation of radioactivity from NaH¹⁴CO₃ into an
acid-stable product [10,11]. Assays contained 50 mM Hepes, pH 8.0,
2.5 mM MgCl₂, 1 mM ATP, 2 mM DTT, 10 mM NaH¹⁴CO₃
(1 mCi/mmol), 20 mM KCl, 0.4 mM appropriate substrate
(acetyl-CoA, 3-methylcrotonyl-CoA, propionyl-CoA or pyruvate)
and 1-150 µg of protein sample, in a final volume of
200 µl. Incubations were for 15 min at 30 °C. One unit of enzyme
activity is equivalent to the incorporation of 1 nmol of ¹⁴CO₂ into
acid-stable product in 1 min at 30 °C. SDS/PAGE was conducted
with a PhastSystem (Pharmacia) in preformed gels (PhastGel;
Pharmacia) containing 12.5% (w/v) acrylamide. Proteins were
electrotransferred from the gel on to nitrocellulose (Bio-Rad),
using a PhastTransfer device (Pharmacia), as recommended by
the manufacturer. Blots were incubated for 1 h at 25 °C in PBS
containing 3% (w/v) BSA, then for 1 h with peroxidase-conjugated
streptavidin (250 ng/ml) in PBST, followed by three
washes with PBST. Biotinylated polypeptides were revealed by
adding peroxidase/substrate solution (1 µl of 30% H₂O₂/ml and
0.5 mg of 4-chloro-1-naphthol/ml).
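
The unit definition above can be turned into a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes that the counts are available as blank-corrected dpm (the name `net_dpm` is hypothetical) and uses the stated bicarbonate specific radioactivity (1 mCi/mmol) and incubation time (15 min).

```python
# Sketch: convert acid-stable 14C counts into enzyme units
# (1 unit = 1 nmol of 14CO2 fixed per min at 30 degC).
# Assumption: counts are supplied as blank-corrected dpm.

DPM_PER_MCI = 2.22e9  # 1 mCi = 3.7e7 Bq = 2.22e9 disintegrations per min


def dpm_per_nmol(specific_radioactivity_mci_per_mmol: float) -> float:
    """dpm carried by 1 nmol of bicarbonate (1 mmol = 1e6 nmol)."""
    return specific_radioactivity_mci_per_mmol * DPM_PER_MCI / 1e6


def enzyme_units(net_dpm: float,
                 specific_radioactivity_mci_per_mmol: float = 1.0,
                 minutes: float = 15.0) -> float:
    """nmol of 14CO2 incorporated into acid-stable product per min."""
    nmol_fixed = net_dpm / dpm_per_nmol(specific_radioactivity_mci_per_mmol)
    return nmol_fixed / minutes


def specific_activity(net_dpm: float, protein_ug: float, **kwargs) -> float:
    """Units per mg of protein in the assay."""
    return enzyme_units(net_dpm, **kwargs) / (protein_ug / 1000.0)
```

At 1 mCi/mmol each nmol of bicarbonate carries about 2220 dpm, so a 15 min assay that fixes 33300 dpm corresponds to one unit under these assumptions.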

Streptavidin-binding assays for biotin 

Since streptavidin binds both free and protein-bound biotin [16], 
two types of assays were used. 

Direct solid-phase biotin assay for protein-bound biotin 

Microtitre e.l.i.s.a. plates (Greiner) were incubated for 3 h at
25 °C with various amounts of sample (0.01-10 µg of protein per
well) in 100 µl of PBS, washed four times with PBST, and each
well received 50 ng of peroxidase-conjugated streptavidin in
100 µl of PBST. After incubation for 1 h at 25 °C, wells were
washed four times with PBST, and each received 100 µl of
peroxidase/substrate solution [0.1 µl of 3% H₂O₂ and 100 µg of
2,2'-azinobis(3-ethylbenzothiazoline-6-sulphonic acid) in 100
mM citrate phosphate, pH 4.0]. Colour development was quantified
with an EL-312 microplate reader (Biotek Instruments),
using a 405 nm filter. Standard curves were generated from
control wells coated with known amounts (1-20 ng) of
biotin-labelled β-galactosidase.
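
Reading a sample against such a standard curve can be sketched as follows. This is an illustrative calculation only: it assumes that the A405 response is linear over the 1-20 ng range of the standard, which the text does not state, and the function names are hypothetical. The result is expressed in ng equivalents of the biotin-labelled β-galactosidase standard.

```python
# Sketch: least-squares standard curve for the direct solid-phase assay.
# Assumption: A405 is linear in the amount of coated standard.

def fit_line(amounts_ng, a405_values):
    """Return (slope, intercept) of the least-squares line A405 = slope*ng + intercept."""
    n = len(amounts_ng)
    mean_x = sum(amounts_ng) / n
    mean_y = sum(a405_values) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts_ng)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amounts_ng, a405_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def standard_equivalents_ng(a405, slope, intercept):
    """Invert the standard curve for one sample well."""
    return (a405 - intercept) / slope
```

Wells whose A405 falls outside the range spanned by the standards should be re-assayed at another dilution rather than extrapolated.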

Indirect solid-phase biotin assay for free and protein-bound biotin

Control experiments showed that free D-biotin cannot bind to
the wells of the microtitre plates. Free biotin was quantified by
modification of standard procedures [13,17]. E.l.i.s.a. plates
were incubated for 3 h at 25 °C with a fixed amount of
biotin-labelled β-galactosidase (usually 100 ng per well) in 100 µl of
PBS, and then washed four times with PBST. To generate a
standard curve, serial dilutions (0.002-80 ng) of D-biotin (final
volume 100 µl in PBST) were prepared in Eppendorf centrifuge
tubes. They were mixed with 100 µl of PBST containing a fixed
amount (usually 20 ng) of peroxidase-conjugated streptavidin,
followed by incubation for 1 h at 25 °C. This yielded a series of
mixtures containing various amounts of free and biotin-bound
peroxidase-conjugated streptavidin. Portions (100 µl) of these
mixtures were then transferred to each biotinylated
β-galactosidase-coated well of the plates. After incubation for 1 h
at 25 °C, plates were processed as for the direct solid-phase
biotin assay, i.e. they were washed four times with PBST and,
after addition of the peroxidase/substrate solution, A405 was
measured as above. Because only free peroxidase-conjugated
streptavidin will bind to biotinylated β-galactosidase, and
because exchange of biotin from preformed biotin-streptavidin
complexes is extremely slow [16], this yielded a titration curve,
A405 versus [D-biotin], such that (i) the lower the [biotin] the
higher the A405 and (ii) at high [biotin] the A405 was close to the
background values. The point at which the slope undergoes a
sharp increase indicates the equivalence point, i.e. the minimal
amount of biotin required to saturate the fixed amount of
conjugated streptavidin used in the assay. The concentration of
free D-biotin at the equivalence point was not dependent on the
duration of the incubation with peroxidase substrates, although
prolonged incubations increased the sensitivity. Furthermore,
the concentration of free D-biotin at the equivalence point was
dependent on the concentration of conjugated streptavidin, but
not on that of biotin-labelled β-galactosidase in the assays. The
sensitivity was of the order of 2 pg of biotin.
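
The reading of an equivalence point from such a titration can be sketched in code. The sketch below is an assumption-laden stand-in, not the procedure used here: instead of locating the change of slope by eye, it takes the smallest amount of biotin (or of extract) from which the A405 stays within a chosen tolerance of background. The tolerance value and all names are hypothetical.

```python
# Sketch: locate the equivalence point of an A405-versus-amount titration.
# Assumption: the equivalence point is approximated by the smallest amount
# from which all higher amounts give A405 within `tolerance` of background.

def equivalence_point(amounts, a405_values, background, tolerance=0.05):
    """Return the equivalence amount, or None if background is never reached."""
    points = sorted(zip(amounts, a405_values))
    equivalence = None
    for amount, a405 in reversed(points):      # walk down from the highest amount
        if a405 - background > tolerance:      # free streptavidin still detected
            break
        equivalence = amount
    return equivalence


def free_biotin_ng_per_ul(standard_equivalence_ng, sample_equivalence_ul):
    """Free biotin concentration of an extract whose equivalence point is
    reached with `sample_equivalence_ul` of acetone supernatant."""
    return standard_equivalence_ng / sample_equivalence_ul
```

Because the equivalence amount depends on the quantity of conjugated streptavidin, the standard and sample titrations are only comparable when run with the same conjugate amount.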
To quantify free biotin in the various plant extracts, the same
protocol was used except that the serial dilutions of free D-biotin
used to construct the calibration curve were replaced by serial
dilutions in PBST of the supernatants derived from the
acetone-treated plant extracts. Free biotin content in these samples was
calculated from the equivalence points, with reference to the
calibration curve, obtained under the same conditions. In principle,
protein-bound biotin could also be quantified by this assay
from the precipitate of the acetone-treated samples. Yet, attempts
to solubilize the acetone pellet from the various extracts under
conditions compatible with preservation of streptavidin-binding
activity (e.g. in the absence of denaturing agents such as SDS)
produced unreliable results. The same limitations were encountered
when proteins in the crude extracts were precipitated with
trichloroacetic acid. The unprocessed crude extracts and the
corresponding acetone supernatants were used therefore for total
(free plus protein-bound) and free biotin determinations
respectively. Whenever possible, the level of protein-bound biotin
was then calculated from the difference between these two
quantities.
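
The difference calculation can be written out explicitly. This is a minimal sketch under an assumption that is not in the text: a hypothetical `min_detectable_ng` threshold stands in for the "whenever possible" condition, so that a difference smaller than the assay noise is reported as not determinable.

```python
# Sketch: protein-bound biotin as total minus free biotin.
# Assumption: `min_detectable_ng` is a hypothetical noise threshold.
from typing import Optional


def protein_bound_biotin_ng(total_ng: float, free_ng: float,
                            min_detectable_ng: float = 0.0) -> Optional[float]:
    """Return total - free, or None when the difference is not resolvable."""
    difference = total_ng - free_ng
    if difference <= min_detectable_ng:
        return None
    return difference
```

When free biotin dominates, as reported for the vegetative organs, the difference of two large and similar numbers is poorly determined, which is why the threshold is made explicit.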



Purification of SBP65 from mature pea seeds

All purification steps were carried out at 4 °C. Chromatographic
steps were performed with a Pharmacia f.p.l.c. system. Frozen
(-75 °C) mature pea seeds (100 g) were finely ground in a
Waring blender. To the powder 500 ml of buffer A was added,
and the mixture was homogenized with a Polytron homogenizer.
After centrifugation (15000 g, 30 min), the resulting supernatant
(17 g of protein) was brought to 200 g/l (NH₄)₂SO₄, stirred for
30 min at 4 °C, allowed to stand at this temperature for 2 h, and
then centrifuged (15000 g, 30 min). The pellet (4 g of protein)
was resuspended in 200 ml of buffer A. This protein extract was
brought to 50 g/l poly(ethylene glycol) (PEG) 6000, stirred for
20 min at 4 °C, and centrifuged (15000 g, 30 min). The derived
supernatant was brought to 200 g/l PEG 6000, stirred for 20 min
at 4 °C, and centrifuged as above. The sticky brown-coloured
pellet was resuspended in 15 ml of buffer A with an Ultra-Turrax
homogenizer; then the suspension was clarified by centrifugation
(50000 g, 30 min), and the supernatant (0.75 g protein, 12 ml)
filtered on a 0.2 µm filter. At this step, biotinylated proteins had
been purified about 22-fold with a 60% recovery. This extract
was subjected to monomeric-avidin affinity chromatography [10,
18-20]. Sample was loaded (flow rate 0.1 ml/min) on the column
(1.5 cm x 20 cm) equilibrated in buffer B (20 mM Hepes, pH 8.0,
10% glycerol, 0.5 M KCl, 1 mM EDTA, 1 mM benzamidine/HCl,
5 mM 6-aminohexanoic acid). After the column had been
washed with 350 ml of buffer B (flow rate 0.2 ml/min), bound
proteins (0.8 mg) were eluted with 50 ml of buffer B containing
2 mM D-biotin. The yield for protein-bound biotin was in the
range 20-30%. Eluted fractions were dialysed against buffer B
containing 50 mM KCl, concentrated to 6 ml with Macrosep-10
tubes (Filtron) and applied to a Mono Q HR5/5 column
(0.5 cm x 5 cm; Pharmacia) equilibrated in buffer B containing
50 mM KCl. SBP65 was recovered in the non-adsorbed fraction.
After the column had been washed with 20 ml of buffer B
containing 0.1 M KCl, adsorbed proteins, supporting ACC and
MCC activities, were eluted with 10 ml of buffer B containing
0.3 M KCl (flow rate 0.5 ml/min; fraction size 0.5 ml).
Molecular-mass determinations were effected by gel filtration
on a Sephacryl S-300 HR column (2.6 cm x 35 cm, 180 ml;
Pharmacia). Protein samples were allowed to react with CNBr as
described [21].
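
The bookkeeping behind a purification summary can be sketched as follows, using the total protein and total bound biotin values listed in Table 1. This is an illustration under stated assumptions: enrichment is defined here as bound biotin per mg of protein relative to the crude extract, and recovery as bound biotin relative to the crude extract. The class and function names are hypothetical, and ratios computed this way need not reproduce the rounded figures quoted in the text, which may rest on a different reference quantity.

```python
# Sketch: enrichment and recovery of protein-bound biotin per purification stage.
# Values are total protein (mg) and total bound biotin (ug) from Table 1.
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    protein_mg: float
    biotin_ug: float


STAGES = [
    Stage("Crude extract of soluble proteins", 17000, 10),
    Stage("PEG 6000 homogenate", 750, 6.4),
    Stage("Monomeric-avidin Sepharose pool", 0.8, 2),
    Stage("Mono Q HR5/5 pool (salt-eluted fraction)", 0.15, 0.2),
    Stage("Mono Q HR5/5 pool (flow-through fraction)", 0.6, 2),
]


def summarize(stages):
    """Return one dict per stage, referenced to the first (crude) stage."""
    crude = stages[0]
    crude_specific = crude.biotin_ug / crude.protein_mg
    rows = []
    for stage in stages:
        specific = stage.biotin_ug / stage.protein_mg
        rows.append({
            "stage": stage.name,
            "ug_biotin_per_mg_protein": specific,
            "enrichment_fold": specific / crude_specific,
            "biotin_recovery_pct": 100.0 * stage.biotin_ug / crude.biotin_ug,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(STAGES):
        print(f"{row['stage']:45s} {row['enrichment_fold']:10.1f}-fold "
              f"{row['biotin_recovery_pct']:6.1f}% recovery")
```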

Preparation of antiserum 

Purified SBP65 (800 µg) was subjected to SDS/PAGE. After
electroelution from the gel with 25 mM Tris/192 mM glycine,
pH 9.0, containing 0.1% SDS, the protein was dialysed against
50 mM Tris/HCl, pH 7.8. Antibodies directed against the
electroeluted SBP65 (anti-SBP65) were raised in a rabbit by standard
protocols. For the preparation of affinity-purified antibodies,
SBP65 (30 µg) was subjected to SDS/PAGE and
electrotransferred on to nitrocellulose. A thin strip corresponding to
SBP65 was excised. After being washed with PBS containing 3%
BSA, the strip was incubated with a 20-fold dilution of the
anti-SBP65 serum, overnight at 4 °C. After three washes with PBST,
specific antibodies were eluted by a quick (30 s) wash of the strip
with 1 ml of 0.2 M glycine/HCl, pH 2.2, and then immediately
neutralized by adding 170 µl of 1 M Tris/HCl, pH 8.8 [22].

D-[¹⁴C]Biotin binding and exchange

Assays contained 50 mM Hepes, pH 8.0, 1 mM DTT, 1.6 µM
D-[¹⁴C]biotin (56.6 mCi/mmol) and sample (10-200 µg of protein),
in a total volume of 15 µl. Assays were incubated at 25 °C for
30 min, and then portions (9 µl) of reaction mixtures were spotted
on glass-fibre filters. After eight washing steps in 5% trichloroacetic
acid, the filters were dried, and the trichloroacetic
acid-precipitable radioactivity was measured by liquid-scintillation
counting.

Figure 1 Biotin content of pea seeds

[...] nomenclature of seed samples, see the Materials and methods section. Free biotin was quantified as in (a). Protein-bound biotin was estimated by using the direct solid-phase biotin assay.

RESULTS 

Non-covalent protein-bound biotin in extracts from mature dry
pea seeds

Biotin-containing proteins from plant extracts have usually been
analysed on nitrocellulose filters of blotted gels, in a system using
streptavidin as a specific reagent, analogous to Western blotting
[10,11,23]. Presumably, only those proteins containing the
covalently attached prosthetic group are detected by this method
[23]. To quantify the bound biotin content of pea seeds, it was
therefore of importance to investigate whether these seeds
contained proteins involved in non-covalent binding of the
vitamin. When an unprocessed crude extract from S0 seeds was
Figure 2 Requirement for biotin in early plant growth

(a) Evaluation of ACC activity in pea leaves. (b) Evaluation of MCC activity in pea roots.
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Figure 3 Distribution of SBP65 in different compartments of mature dry 
pea seeds 

(a) Coomassie Blue R250 staining of a homogeneous SDS/polyacrylamide gel containing 12.5%
acrylamide loaded with total protein extracts from seed coat (lane 2, 1 µg), embryonic axis (lane
3, 4 µg) and cotyledons (lane 4, 4 µg) of mature dry pea seeds. Molecular-mass markers (lane
1) are indicated in kDa. (b) Nitrocellulose blot of a duplicate of (a) probed with
peroxidase-conjugated streptavidin. Lane 1, cotyledon extract (40 µg); lane 2, embryonic axis extract
(40 µg); lane 3, seed coat extract (10 µg). Molecular-mass biotinylated markers (lane 4) are
indicated in kDa.



Table 1 Purification procedure for pea SBP65 

SBP65 was purified starting from 100 g of mature dry pea seeds. Protein was determined by 
the method of Bradford [15] with BSA as standard. Bound biotin was measured by using the 
direct solid-phase biotin assay described in the Materials and methods section. 





Purification stage                           Total protein (mg)   Total biotin (µg)

Crude extract of soluble proteins            17000                10
PEG 6000 homogenate                          750                  6.4
Monomeric-avidin Sepharose pool              0.8                  2
Mono Q HR5/5 pool (salt-eluted fraction)     0.15                 0.2
Mono Q HR5/5 pool (flow-through fraction)    0.6                  2








[Figure axis: Time after sowing (days)]



incubated with D-[¹⁴C]biotin, no radioactivity was recovered in a
form precipitable by trichloroacetic acid. Under the same
conditions, binding of D-[¹⁴C]biotin to streptavidin could be
demonstrated. Therefore, unlike egg yolk and egg white [24], mature dry
pea seeds do not seem to contain significant amounts of proteins
involved in non-covalent binding of biotin.

Free and protein-bound biotin in developing, mature and 
germinating pea seeds 

Figure 1(a) shows that mature dry seeds contain a 5-fold higher 
amount of protein-bound biotin than free biotin. Respective 
amounts of about 24 ng and 5 ng of vitamin per seed were 
calculated, corresponding to about 100 pg of protein-bound 
biotin and 20 pg of free biotin per mg fresh weight. These values 
were within the range of total biotin levels reported for various 
feedstuffs of plant origin [25] and for seeds of A. thaliana [13]. 

A considerable amount of the initial protein-bound biotin 
disappeared during early stages of germination. At day 2, which 
coincides with radicle emergence, the level of bound biotin was 
of the order of 20 % of that in the mature dry seeds. After day 
5, this level was below detection (Figure 1b). Figure 1(c) shows
the data obtained from seed samples collected at various de- 
velopment stages. On a per mg of protein basis, the level of free 
biotin was high in the young seeds, and then decreased during 
development. In contrast, the protein-bound biotin level was low 
in the young seeds, and then rose sharply during late stages of 
seed development and onset of desiccation, being maximal in the 
mature dry seeds. 

Content of free and total biotin in various organs of pea plants 

Biotin contents were determined for various organs of pea plants 
collected from 4 to 40 days after sowing. All extracts from leaves, 
roots, stems, pods and flowers yielded the same results as those 
obtained for the young pea seeds, i.e. although free biotin could 
be easily detected with the indirect solid-phase biotin assay, the 
amount of protein-bound biotin was almost negligible compared 
with that of free biotin. At day 39, free biotin levels of 11, 8.5, 
7.5, 2 and 1 ng/mg of protein were calculated for the leaves, 
roots, stems, pods and flowers respectively. These levels varied 
during plant growth. For example, at day 5 and 25 the free biotin 
content was 1 and 9 ng/mg of protein for the leaves and 3 and 
10 ng/mg of protein for the roots. This is in agreement with
reported values of 1.2 ng/mg of protein and 0.2 ng/mg of protein
for the free and protein-bound biotin content of 10-day-old pea
leaves respectively [12].

Figure 4 Purification of SBP65 by chromatography on Mono Q HR5/5

Although the protein-bound biotin content was low, extracts
from leaves, roots, stems, pods and flowers contained significant
levels of ACC and MCC activities. Those for PCC and PC were
close to background values. ACC activity increased strongly in
the leaves from day 3 to day 8, and then declined in older tissue
(Figure 2a), similarly to previous observations with maize leaves
[26]. The same pattern of behaviour was observed for MCC in
the roots (Figure 2b). These data suggest that there is a strong
demand for biotin in the early stages of plant growth.

Characterization of biotinylated proteins from mature dry pea
seeds

Seed coats, cotyledons and embryonic axes were excised from
mature dry seeds, and extracts were prepared as described in the
Materials and methods section. Figure 3 shows that a major
biotinylated polypeptide of molecular mass 65 ± 2 kDa was
present in both cotyledons and embryonic axis. This protein is
referred to as SBP65, for 65 kDa seed biotinylated protein. On a
per mg of protein basis, the level of SBP65 was nearly the same
in both seed compartments. As the total protein content of
cotyledons is much higher than that of embryonic axis, the greater
proportion of SBP65 is in the cotyledons. In contrast, SBP65 was
not detected in the seed coat (Figure 3).

SBP65 was purified from mature dry pea seeds by a two-step
chromatographic procedure (Table 1). Proteins specifically eluted
from the monomeric-avidin Sepharose column were separated
into two groups after chromatography on Mono Q HR5/5
(Figure 4). One group, which did not adsorb to this column,
comprised the major part of the loaded biotinylated proteins
(Figure 4a) and contained SBP65 (Figure 4d). This fraction
exhibited no biotin carboxylase activity, using any of the four
acceptor substrates, acetyl-CoA, 3-methylcrotonyl-CoA (Figure
4b), propionyl-CoA and pyruvate (not shown). Another group
(the salt-eluted fraction) supported MCC and ACC activities
(Figure 4b). Again PCC and PC activities were not detectable.
SDS/PAGE analysis of this fraction disclosed the presence of
four major polypeptides (Figure 4d). The molecular masses of
the three fast-migrating polypeptides were 75 kDa, 65 kDa and
50 kDa (Figure 4d). That of the slowly migrating species was
Figure 5 Evaluation of SBP65 in pea seeds and cotyledons during germination

(a) Coomassie Blue R250 staining of a homogeneous SDS/polyacrylamide gel containing 12.5% acrylamide loaded with total protein extracts from mature seeds (S0) (lane 2), seeds collected
at day 1 (lane 3) and cotyledons collected at day 3 (lane 4), 5 (lane 5) and 7 (lane 6) after sowing. An amount of 4 µg of protein was loaded in each lane. Molecular-mass markers (lane 1) are
indicated in kDa. (b) Nitrocellulose blot of a duplicate of (a) probed with peroxidase-conjugated streptavidin. Lanes 1, 2, 3, 4 and 5 correspond respectively to lanes 6, 5, 4, 3 and 2 in (a). An
amount of 16 µg of protein was loaded in each lane. Molecular-mass biotinylated markers (lane 6) are indicated in kDa. (c) Evaluation of total protein (■), protein-bound biotin (○) and
anti-SBP65 antibody reactivity (●) in seeds (day zero and 1) and cotyledons (day 2, 3, 5, 7 and 8) at different times after sowing. Radicle emergence is indicated by the arrow. Results are expressed
as percentage of corresponding S0 values. (d) Evaluation of ACC ([...]) and MCC activities (■).



estimated to be of the order of 200 kDa on SDS/PAGE carried
out in gels of higher resolution than that of Figure 4, containing
a [...]-20% polyacrylamide gradient. Of these four polypeptides, a
Western blot analysis with peroxidase-conjugated streptavidin
revealed that three (the 65 kDa, 75 kDa and 200 kDa proteins)
were biotinylated (not shown). The 65 kDa polypeptide
corresponded presumably to contaminating SBP65, which would
account for the slight reactivity of the anti-SBP65 antibodies
with the salt-eluted fraction (Figure 4a). The 75 kDa biotinylated
protein and the 50 kDa non-biotinylated protein could be
assigned to the two non-equivalent subunits of MCC. Indeed,
the latter enzyme was purified to homogeneity from pea leaves
with similar molecular-mass subunits [10]. The 200 kDa
polypeptide corresponded presumably to ACC, as this enzyme was
purified from pea embryos with similar molecular mass [27]. We
note that the purification protocol used by Bettey et al. [27]
included a fractionation step on DEAE-Sepharose, and that only
the adsorbed proteins were analysed further for purification.
Presumably, SBP65 was not detected in the above cited
experiments, because the protein adsorbs to neither Mono Q HR5/5
(Figure 4) nor DEAE-TSK (Merck) (not shown). The fact that
the 200 kDa and 75 kDa polypeptides were almost not detected
in crude extracts (Figure 3) confirmed that SBP65 is the major
biotinylated protein in mature dry pea seeds. The purification
procedure yielded about 600 µg of SBP65, starting from 100 g of
mature dry pea seeds (Table 1).
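
A rough molar ratio of biotin to polypeptide can be derived from the Mono Q flow-through values in Table 1 (0.6 mg of protein, 2 µg of bound biotin). The sketch below is an illustration only: it assumes that the flow-through protein is essentially pure SBP65 with a subunit mass of 65 kDa, and it uses the molecular mass of D-biotin, 244.31 g/mol. The function name is hypothetical.

```python
# Sketch: mol of biotin per mol of 65 kDa polypeptide from mass amounts.
# Assumption: the Mono Q flow-through protein is taken as pure SBP65.

BIOTIN_G_PER_MOL = 244.31     # molecular mass of D-biotin
SBP65_G_PER_MOL = 65000.0     # apparent subunit mass from SDS/PAGE


def biotin_per_subunit(biotin_ug: float, protein_ug: float) -> float:
    """Molar ratio; ug / (g/mol) gives umol for both terms."""
    biotin_umol = biotin_ug / BIOTIN_G_PER_MOL
    protein_umol = protein_ug / SBP65_G_PER_MOL
    return biotin_umol / protein_umol


ratio = biotin_per_subunit(biotin_ug=2.0, protein_ug=600.0)
```

Under these assumptions the ratio comes out close to 0.9 mol of biotin per mol of polypeptide, that is, roughly one biotin per 65 kDa subunit.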



Biochemical properties of SBP65 

An acid hydrolysate [12] of SBP65, but not a native sample of
this protein, supported bioB105 bacterial growth (this strain
requires added biotin for growth [28]), suggesting that biotin was
covalently bound to the protein. This finding was further
supported by the following observations. (i) SBP65 was specifically
detected on nitrocellulose filters of blotted
SDS/polyacrylamide gels with peroxidase-conjugated streptavidin
(Figure 3); (ii) there was no binding or exchange of biotin after
incubation of extensively dialysed SBP65 with D-[¹⁴C]biotin in
the temperature range 25-70 °C, and in the pH range 4-9; and
(iii) SBP65 reacted positively and specifically in e.l.i.s.a.
conducted with the anti-biotin peroxidase-conjugated antibodies.
When SBP65 was allowed to react with CNBr (which cleaves
polypeptide chains on the carboxyl side of methionine residues)
and the resulting mixture assayed for protein-bound biotin, the
reacted sample contained less than 0.5% of the biotin present in
the native SBP65 control. An indirect solid-phase biotin assay
showed that biotin was released in the supernatant when the
treated SBP65 was precipitated by cold acetone. The same
behaviour was observed with a sample of biotin-carboxylase
mixture, as obtained from the Mono Q column. As the consensus
sequence for covalent binding of biotin to biotin carboxylases is
Met-Lys-Met, where the vitamin is attached to the lysine residue
[2,29], these results suggest a similar mode for binding of biotin
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Figure 6 Evaluation of SBP65 during seed formation and maturation 
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to SBP65. The biotin content of SBP65, as determined using the 

indirect solid-phase biotin assay, yielded a stoichiometry of
0.?5 ± 0.15 mol of biotin per 1 mol of 65 kDa polypeptide. The
native molecular mass of SBP65 was estimated by gel filtration
on Sephacryl S-300 HR to be 450 ± 60 kDa. This indicates that
the native protein assumes a polymeric structure, resulting from
the association of six to eight identical subunits.
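
The subunit estimate above follows from simple division of the gel-filtration mass by the subunit mass. The short sketch below reproduces that arithmetic; it assumes the reported uncertainty is ± 60 kDa and that all subunits are identical 65 kDa polypeptides, and the function name is illustrative only.

```python
# Sketch: how many 65 kDa subunits are compatible with the
# gel-filtration estimate of the native SBP65 mass?
SUBUNIT_KDA = 65.0      # subunit mass on SDS/PAGE (from the text)
NATIVE_KDA = 450.0      # native mass by gel filtration (from the text)
NATIVE_ERR_KDA = 60.0   # reported uncertainty (assumed to be +/-)


def subunit_range(native, err, subunit):
    """Return (low, high) subunit counts implied by native +/- err."""
    return (native - err) / subunit, (native + err) / subunit


low, high = subunit_range(NATIVE_KDA, NATIVE_ERR_KDA, SUBUNIT_KDA)
print(f"{low:.1f} to {high:.1f} subunits")
```

The bounds bracket the "six to eight" subunits quoted in the text; the calculation says nothing about the actual geometry of the oligomer.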



Developmental patterns of SBP65 expression in pea seeds

Because SBP65 appears to correspond to the major biotinylated
protein in pea seeds, one might expect similar developmental
patterns for the expression of this protein and for that of total
protein-bound biotin. In agreement with this expectation there
was a rapid fall for SBP65 from seeds and cotyledons in early
steps of germination (Figure 5b), which was even more rapid
than that of total extractable proteins (Figure 5c). It is known
that in Pisum sativum, mobilization of protein reserves from the
cotyledons commences just after radicle emergence [30]. SBP65
was absent in the initial period of seed development, and then
accumulated considerably in later stages. This accumulation
paralleled that of storage proteins (Figures 6a, 6b, and 6c).
Clearly, the development patterns of ACC and MCC activities
differed from that of SBP65 (Figure 5d and Figures 6e and 6f).
In the developing seeds both ACC and MCC appeared to be
constitutively expressed, being present at all developmental
stages. Also, the Western-blot analyses of Figures 5(b) and 6(d)
confirmed that the two biotin-dependent carboxylases were only
present in much smaller amounts than SBP65 in the final period
of seed maturation.






Organ-specific expression of SBP65




Leaves, roots, stems, pods and flowers were harvested from
plants from day 4 to 40 after sowing, and crude extracts were
prepared as described in the Materials and methods section. In
contrast with the results obtained with the mature seeds, SBP65
was not detectable in these extracts, after either incubation of
nitrocellulose filters of blotted SDS/polyacrylamide gels with
peroxidase-conjugated streptavidin or by conducting e.l.i.s.a.
with the anti-SBP65 antibodies. The expression of SBP65 is
therefore confined exclusively to the seeds.



DISCUSSION

A major rationale of this work was to characterize the free and
protein-bound biotin contents of pea seeds, and to investigate
whether the temporal expression of these contents can be used to
define developmental stages during seed formation and
germination. Ion-exchange chromatography on Mono Q HR5/5
allows one to separate the biotinylated proteins of mature dry pea
seeds into two classes. The first one is not adsorbed by this
column, and is associated with a single protein which we have
called SBP65. This protein is seed specific, being absent from
leaves, roots, stems, pods and flowers. No other biotin-containing
protein, exhibiting such a strict organ-specific expression, has yet
been reported in plants. A second class of proteins is retained by
the Mono Q column, and is mainly associated with MCC and
ACC activities. In sharp contrast with SBP65, these two
biotinylated enzymes do not exhibit specific organ expression, being
detected at all developmental stages in all tissues examined.

We have analysed the relative levels of free and protein-bound
biotin in various organs of pea plants. For all proliferating
tissues, including the young developing seeds, free biotin is in
excess of protein-bound biotin. Presumably, in such tissues free
biotin plays a specialized role in growth, and biotinylation of the
biotin-dependent carboxylases can only be achieved with an
excess of the free vitamin. Studies with mammalian cells suggested
that biotin might be involved in regulation of diverse processes
such as DNA transcription and replication, cell growth and
differentiation, highlighting functions for the vitamin other than
as a prosthetic group for the biotinylated carboxylases [2]. In
marked contrast, only for the mature dry pea seeds is the level of
bound biotin greater than that of free biotin, owing to the
accumulation of SBP65.

In considering possible roles for SBP65 in seed development,
it is plausible that kinetic competition between different protein
acceptors for biotin may divert the normal flux of the vitamin
from the housekeeping biotinylated carboxylases to SBP65. In



favour of this hypothesis, the results of CNBr cleavage experi- 
ments suggest a similar mode for binding of biotin to the pea seed 
biotinylated carboxylases and to SBP65. Also, SBP65 might 
constitute a storage form of the vitamin, reserved for the embryo 
to start its growth during the germination process. It is worth 
noting in this context the very fast decrease in the level of SBP65 
after seed planting. Other possibilities are that SBP65 supports 
some as yet unidentified enzyme activity, or that it has a 
regulatory function at the level of gene expression. As the 
sequences of several biotinylated proteins are now available 
[31-34], more definitive evidence about the function of SBP65 
can be acquired by cloning and determination of the nucleotide 
sequence of the gene for this protein. An interesting parallel to our 
observations with the pea seeds is found in a recent study 
showing the accumulation of a 62 kDa biotin-containing polypeptide
during development of carrot somatic embryos [11]. If the
temporal pattern of expression of major biotinylated poly- 
peptides in various seeds were similar, further characterization of 
pea SBP65 may provide valuable molecular insights for under- 
standing the role of biotin and protein biotinylation in seed 
development and germination. 
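
The CNBr reasoning invoked in the Discussion can be illustrated with a small sketch. CNBr cleaves on the carboxyl side of methionine, so a Met-Lys-Met site is cut on both sides of the lysine, leaving the biotinylated residue on a very short fragment that would plausibly stay in the supernatant after acetone precipitation. The peptide below is hypothetical and is not the SBP65 sequence; the function name is likewise illustrative.

```python
def cnbr_fragments(seq):
    """Split a one-letter peptide sequence after every Met (M).

    Models only the position of cleavage; the chemical conversion of
    the C-terminal Met during the reaction is ignored.
    """
    fragments, current = [], ""
    for residue in seq:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Hypothetical peptide containing the Met-Lys-Met consensus site.
peptide = "AVLEAMKMEHVK"
fragments = cnbr_fragments(peptide)
print(fragments)  # the biotin-bearing Lys ends up in the short "KM" piece
```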

This study was conducted as part of the BioAvenir program financed by Rhône-Poulenc
with the contribution of the Ministère de la Recherche et de l'Espace and
the Ministère de l'Industrie et du Commerce Extérieur. We are grateful to Thierry
Degache for growing the pea plants, Laure Dehaye, Pierre Baldet and Salvatore
Sparace for their help in the biotin content determinations and initial purification of
SBP65, Jacques-Henri Julliard for his advice on the CNBr cleavage experiments, and
Rick De Rose for helpful discussions.
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Definitive evidence is presented for the bifunctional 
nature of the biotin repressor protein, which possesses
both regulatory and enzymatic activities. The repressor
protein can activate biotin in the presence of ATP to
form biotinyl-5'-adenylate, the co-repressor, which
remains tightly bound to the repressor protein. This
complex can either bind to the operator site and inhibit
transcription or transfer the biotinyl moiety to a lysine
residue of the apoenzyme of acetyl-CoA carboxylase.
The two activities were coincident throughout a
purification procedure which resulted in a 3500-fold
increase in activity. Gel electrophoresis of the purified
preparation, under native or denaturing conditions,
showed three proteins with the activity corresponding
to the major protein band of apparent Mr = 34,000. On
gel exclusion chromatography, the activity was also
associated with a protein of Mr varying from 37,000 to
44,000, indicating the protein is monomeric. The
occasional appearance of multiple bands with biological
activity in the native gels suggests that the repressor
protein can also exist in multimeric forms. On
chromatofocusing, the repressor activity and the
holoenzyme synthetase activity were coincidental, with the
peak of activity at pH 7.2, the isoelectric point. Only
a single protein band with Mr = 34,000 was observed
on SDS gel electrophoresis of all fractions showing
activity.



Regulatory mutants of the biotin operon, bioR, have been
isolated by Eisenberg (1) on the basis of their resistance to
the biotin analogue, α-dehydrobiotin, a growth inhibitor. The
enzymes of the biotin operon are fully derepressed in these
mutants, which also exhibit high biotin excretion. Similar
mutants have also been isolated by Pai (2) on the basis of the
latter property. Co-transductional analysis by both groups has
placed bioR at min 89, near the bfe locus, on the Escherichia
coli genetic map. A rifampicin-dominant transducing phage,
λdrif*, isolated by Kirschbaum and Konrad (3), has been found
by transductional analysis to carry the bioR gene and has
been used to amplify the bioR gene product, the biotin
repressor protein. With an in vitro protein-synthesizing system,
Prakash and Eisenberg (4) have been able to demonstrate the
repression of the synthesis of two of the biotin biosynthetic
enzymes when the system was fortified with both the partially
purified repressor protein and biotin. In a subsequent study of

* This investigation was supported by Public Health Service Grant
AM-14450 from the National Institute of Arthritis, Diabetes, and
Digestive and Kidney Diseases. The costs of publication of this article
were defrayed in part by the payment of page charges. This article
must therefore be hereby marked "advertisement" in accordance
with 18 U.S.C. Section 1734 solely to indicate this fact.



the interaction of the repressor protein-biotin complex with
the operator site of the biotin operon, these investigators
concluded that the true co-repressor is biotinyl-5'-adenylate.
The reason that biotin could function in the in vitro system
was that the partially purified repressor protein preparation
could also activate biotin in the presence of ATP to form
biotinyl-5'-adenylate (5). No definitive conclusion could be
reached whether the repressor activity and the enzymatic
activity are properties of a single protein.

A bir mutant, isolated by Campbell et al. (6), appears
similar to the bioR mutants in that the enzymes of the biotin
operon are derepressed. This mutant, in contrast, requires
high biotin concentrations for growth and shows decreased
permeability to biotin. The suggestion was made that a single
protein may be responsible for this pleiotropic phenotype.
Although initial mapping placed the mutation near thi, a
subsequent analysis by Pai and Yau (7) indicated that bir and
bioR probably map at the same locus. In an extensive
biochemical and genetic analysis, Barker and co-workers (8-10)
have determined that bir is the structural gene for the enzyme,
biotin holoenzyme synthetase. This enzyme activates biotin
in the presence of ATP and links it covalently to a lysine
moiety of acetyl-CoA carboxylase, the predominant biotin
enzyme of E. coli. Complementation analysis disclosed the
complete or nearly complete overlap of the bir and bioR genes,
suggesting their possible identity. In addition, these
investigators have been able to demonstrate that a partially purified
preparation of the holoenzyme synthetase protein would
protect the operator site from the TaqI restriction enzyme, but
only in the presence of biotinyl-5'-adenylate. Thus, the evidence
from both laboratories suggests a bifunctional protein with
both regulatory and enzymatic properties.

The present study reports the isolation of a homogeneous 
repressor protein. We have demonstrated that a single protein 
possesses both regulatory and enzymatic activities: a unique 
property among repressor proteins thus far isolated. 

MATERIALS AND METHODS 1 
RESULTS 

A summary of the purification procedure, shown in Table 
I, indicates about a 3500-fold purification of the repressor 
protein. This actually represents a 24,000-30,000 increase in 

1 Portions of this paper (including "Materials and Methods" and
Figs. 1-6) are presented in miniprint at the end of this paper.
Miniprint is easily read with the aid of a standard magnifying glass. Full
size photocopies are available from the Journal of Biological
Chemistry, 9650 Rockville Pike, Bethesda, MD 20814. Request Document
No. 82M-1623, cite authors, and include a check or money order for
$4.40 per set of photocopies. Full size photocopies are also included in
the microfilm edition of the Journal that is available from Waverly
Press.

We have recently screened a tomato leaf cDNA expression library in the
phage vector Charon 16 (1) with antibodies made against chloroplast
membrane proteins. The screening procedure utilized secondary antibodies
conjugated to biotin and alkaline phosphatase (AP) conjugated to avidin
(ABC reagent, Vector Laboratories). The nucleotide sequence of one of the
clones obtained and the protein sequence of its longest, but still
incomplete, reading frame (Fig. 1) showed no homology to the sequences of
interest. A search of protein data banks revealed homology to several
biotin-containing proteins from both animals and bacteria, and especially
strong homology with the biotinyl subunit of the transcarboxylase of
Propionibacterium (2). The biotin binding site of the latter,
GQTVLVLEAMKME, differs by a single residue from a similar sequence in the
tomato protein. Direct screening of phage plaques with avidin-AP in the
absence of both primary and secondary antibodies subsequently indicated
that this tomato polypeptide directly binds avidin-AP. Our results
identify one source of "false" positives in screening cDNA expression
libraries with an avidin-biotin based detection system. The tomato cDNA
clone, which appears to be the only one identified to date encoding a plant
biotin-binding protein, is available upon request.
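
The single-residue difference mentioned above can be checked with a sliding-window comparison. The sketch below slides the transcarboxylase motif quoted in the text along the tomato peptide fragment printed in Fig. 1 and reports the window with the fewest mismatches. The function name and the choice of an ungapped comparison are assumptions made for illustration, not the database search the authors ran.

```python
def best_alignment(motif, seq):
    """Slide motif along seq without gaps; return (start, mismatches).

    mismatches is a list of (position_in_motif, motif_residue,
    seq_residue) tuples, with positions counted from 1.
    """
    best = None
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        diffs = [(i + 1, m, w)
                 for i, (m, w) in enumerate(zip(motif, window)) if m != w]
        if best is None or len(diffs) < len(best[1]):
            best = (start, diffs)
    return best


transcarboxylase_site = "GQTVLVLEAMKME"            # quoted in the text
tomato_fragment = "EGQPVLVLEAMKMEHVVKAPANGY"       # from Fig. 1

start, diffs = best_alignment(transcarboxylase_site, tomato_fragment)
print(start, diffs)
```

With these inputs the best window differs at one position only (Thr in the bacterial site, Pro in the tomato fragment), in line with the statement in the text.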

GTVVAPMVGLEVKVLVKDGEKVQ 
GGTACTGTGGTTGCACCTATGGTTGGGTTAGAGGTTAAAGTATTGGTGA 7 0 



EGQPVLVLEAMKMEHVVKAPANGY 
AGGGACAACCTGTGTTAGTATTAGAAGCAATGAAGATGGAGC^ 140 

VSGLEXKV. GQSVQDGXKX.FALKD 
TGTAAGCGGGCTTGAAATCAAAGTGGGCCAATCGGTCCAAG 210 

TGAAATATATCCTGAGGCTATGACAACATCATTCTAG 280 

ACATCCACTAGGGATAAGAATAACAACATTGAGATCTAA 350 

AATACATTTACTTGTAAATGACTTTCCAGACTCAn 38 4 

Figure 1: Nucleotide sequence of the tomato cDNA clone and sequence of the
encoded polypeptide. The putative biotin-binding site is overlined.
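
As a consistency check on the figure, the first printed nucleotide line can be translated with the standard genetic code and compared with the peptide line printed above it. This is a minimal sketch: the codon table is built from the usual TCAG ordering, only the legible part of the first line is used, and the trailing incomplete codon is ignored.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate complete codons of nt in frame 1; drop any remainder."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


# First nucleotide line of Fig. 1 as it survives in this copy.
first_line = "GGTACTGTGGTTGCACCTATGGTTGGGTTAGAGGTTAAAGTATTGGTGA"
printed_peptide = "GTVVAPMVGLEVKVLVKDGEKVQ"

peptide = translate(first_line)
print(peptide, printed_peptide.startswith(peptide))
```

The translated residues agree with the start of the printed peptide; the remaining residues cannot be checked because the nucleotide line is truncated in this copy.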
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Abstract 

A cDNA clone, GmPM4, which encodes an mRNA species present in mature or dry soybean seeds, was characterized. DNA
sequence analysis shows that the deduced polypeptide has a molecular mass of 68 kDa. GmPM4 proteins have a
relatively high amino acid sequence homology with a major biotinylated protein isolated from pea seeds, SBP65,
but both of these proteins differ markedly from presently known biotin enzymes. The accumulation of
GmPM4 mRNA is detectable in the leaf primordium and the vascular tissues of the hypocotyl-radicle axis of mature
seeds, and the GmPM4 proteins are present at high levels in dry and mature soybean seeds, but not in fresh
immature seeds. They are degraded rapidly at the early stage of seed germination. These proteins are boiling-soluble
and biotinylated when they are present endogenously in soybean seeds; however, the same recombinant protein
expressed in Escherichia coli is boiling-soluble, but it is not biotinylated.



Introduction 

Late maturation of seeds is marked not only by water
loss but also by a drastic change in the profile of proteins
synthesized [e.g. 31]. Proteins synthesized during this
late stage, which are correlated with desiccation
tolerance [1], ABA content [25], or transition to seedling
growth [31], are often termed maturation proteins [31]
or late embryogenesis-abundant (LEA) proteins [18].
Maturation proteins are slightly different from LEA
proteins in that the messages for maturation proteins
are not necessarily present at as high a level
as LEA messages during late embryogenesis. Based on
the commonly shared amino acid sequence domains,
LEA proteins are grouped into three or four groups
[8, 9]. Virtually all of the LEA proteins are highly
hydrophilic, contain no Cys or Trp residues, and are
boiling-soluble [8, 9]. It has been hypothesized that



The nucleotide sequence data reported will appear in the 
EMBL, GenBank and DDBJ Nucleotide Sequence Databases under 
the accession number U59626. 



LEA proteins may play a protective role in plant cells
under various stress conditions and that this protective role
may be essential for the survival of the plant under
extreme stress conditions [9, 37].

We have isolated a number of cDNA clones of
soybean seed maturation proteins from a pod-dried
seed cDNA library by differential screening [23, 24].
These are designated GmPM clones, denoting
Glycine max physiologically mature. pGmPM1 [5]
and pGmPM9 [22] were found to be members of the
group 4 LEA family, and pGmPM2 [19], pGmPM8
and pGmPM10 [20] members of the group 3 LEA family.

In the present study, we report the identification
and characterization of a novel soybean seed maturation
protein and its cDNA clone, GmPM4. We show
that the GmPM4 protein is biotinylated. We investigated
the cellular and tissue localization of the RNA
transcripts and the expression of this protein during
seed development and germination. We also characterized
the biochemical behavior of the recombinant
plant GmPM4 proteins in E. coli.
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Materials and methods 

Plant materials 

Soybean (Glycine max L. cv. Shi-shi) seeds were
kindly provided by the Kaohsiung Agricultural Experimental
Station, Pintong, Taiwan. Plants were grown
to maturity in a field environment. Pods were harvested
at mid-development, about 35 days after flowering
(DAF), and seeds were precociously matured by
air-drying the intact pods (pod-dried, PD) for several
days as indicated in each experiment [23, 24]. The
fresh weight, dry weight, moisture content and germination
behavior of the soybean seeds throughout
the course of development, from 21 DAF to 70 DAF,
were studied by Rosenberg and Rinne [31]. Germination
occurred between 12 h (0% of seed germination)
and 24 h (ca. 90% of seed germination) after sowing.
Germinating seeds were harvested at certain times
after imbibition as indicated in each experiment. Soybean
seedlings with four compound leaves were also
used in the studies. Leaves of control plants or those
treated with 2% PEG 6000 for 5 days were harvested.
All plant materials were frozen in liquid nitrogen and
stored at −70 °C until use.

Isolation of cDNA clones, cDNA sequencing and
molecular analysis

A soybean dried-seed cDNA library was constructed 
and a series of GmPM cDNA clones were selected 
by differential screening as previously described [24]. 
The nucleotide sequence of pGmPM4 was determined 
with the dideoxy chain termination method [33] using 
Sequenase (United States Biochemicals), and the data 
were analyzed using the Genetics Computer Group 
Sequence Analysis software package Version 9.0 [7].

Protein extraction and analysis 

Total soluble proteins of soybean seeds or E. coli were
extracted by homogenizing in ice-cold buffer A consisting
of 63 mM Tris-HCl pH 7.8, 20 mM MgCl2
and 1 mM PMSF (phenylmethylsulfonyl fluoride). After
homogenization, an equivalent volume of Laemmli
protein solubilization buffer [28] was added. The sample
was incubated at 100 °C for 5 min before gel
loading.

Boiling-soluble proteins of soybean seeds or 
E. coli were extracted by homogenizing in ice-cold 
grinding buffer consisting of 20 mM TES/KOH (pH 
8.0) and 500 mM NaCl. The slurry was transferred to 



a centrifuge tube and incubated at 100 °C for 10 min
and then at 4 °C for 10 min. After centrifugation,
the proteins in the supernatant were concentrated by
acetone precipitation. The precipitate was resuspended in
Laemmli protein solubilization buffer, and the slurry
was then incubated at 100 °C for 5 min before gel
loading.

Characterization of the GmPM4 cDNA clone was performed
by hybrid select translation. Fifty µg of plasmid
DNA was denatured and spotted onto nitrocellulose
filters. The filters were hybridized with poly(A)
RNA prepared from the cotyledons of 4-day PD,
35 DAF soybean seeds. The RNA sample hybridized
to the filters was translated in vitro using a rabbit
reticulocyte lysate system (Promega Biotec, USA) in the
presence of 35S-Met.

All protein samples were separated by one-dimensional
12.5% SDS-polyacrylamide gel electrophoresis,
and detected by Coomassie blue staining,
fluorography or western blot. Biotinylated polypep- 
tides were revealed by AP-conjugated streptavidin 
and the chromogenic substrate nitroblue tetrazolium. 
For comparison, the sera against 130 kDa [24] and 
GmPM8 [20] seed maturation proteins and against 
seed storage protein glycinin were also used in western 
blot analysis. 

Construction of recombinant plasmid and induction 
of protein expression 

A 2.0 kb NdeI-EcoRI fragment of pGmPM4 was ligated
to the NdeI-EcoRI fragment of the pET-24a T7
expression vector (Novagen, USA). Transformation of
E. coli, preparation of plasmid DNA, and other routine
procedures were performed according to the protocols
of manufacturers and of Sambrook et al. [32]. Cells
were cultured in LB medium with a supplement of
4 µM biotin. Expression of the GmPM4 recombinant
protein was induced by the addition of 1 mM IPTG.

mRNA in situ hybridization 

Four-day PD soybean seeds were harvested and immediately
fixed with 4% formaldehyde solution. Tissues
were fixed and paraffin-embedded according to
the procedures described by Jackson [26]. The slides
were hybridized with digoxigenin-labeled antisense
or sense strands of GmPM4 probes transcribed by
T3 or T7 RNA polymerase from linearized pBluescript
SK- harboring the GmPM4 cDNA. Labeling
with digoxigenin was performed according to instructions
provided by the manufacturer (Boehringer
Mannheim, Germany). Hybridization of the probes
to the slides was carried out at 50 °C overnight, and an
anti-digoxigenin-AP conjugate (Boehringer) and the
color substrates were used for detection.

In vitro biotinylation of the recombinant proteins 

About 55 DAF pea (Pisum sativum) seeds were used
in an in vitro biotinylation reaction of the GmPM4
recombinant protein. These seeds had a mean fresh
weight of 450 mg and a water content of 55%. They
were ground in buffer A (Tris buffer) or buffer B containing
50 mM HEPES pH 7.4, 1 mM PMSF, 1 mM
EDTA, 1 mM DTT and 10% glycerol at 4 °C. The
slurry was centrifuged at 11 000 × g and the supernatant
was collected as the crude enzyme extract. The
in vitro biotinylation assay was carried out in 100 µl
reaction mixture containing 50 mM HEPES pH 7.4,
2 mM MgCl2, 1 mM NADH, 2 mM ATP, 2 µM biotin,
total E. coli proteins containing about 5 µg recombinant
GmPM4 protein, and 500 µg of crude enzyme
extract from pea seeds. The reaction was carried out at
30 °C for 2 h and stopped by adding gel loading buffer.
The biotinylation of the proteins was then detected by
SDS-PAGE and western blot.
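
To put the reaction mixture in molar terms, the sketch below converts the stated final concentrations into amounts per 100 µl reaction and compares free biotin with the recombinant substrate. It assumes the predicted mass of 67 988 Da reported in the Results for GmPM4 and, purely for illustration, one biotinylation site per polypeptide; the helper names are not from the paper.

```python
REACTION_UL = 100.0        # reaction volume (from the text)
GMPM4_DA = 67988.0         # predicted mass of GmPM4 (from the Results)


def nmol_from_mM(conc_mM, volume_ul):
    """mM x µl gives nmol (1e-3 mol/l x 1e-6 l = 1e-9 mol)."""
    return conc_mM * volume_ul


def nmol_from_ug(mass_ug, mass_da):
    """µg divided by g/mol gives µmol; multiply by 1000 for nmol."""
    return mass_ug / mass_da * 1000.0


atp_nmol = nmol_from_mM(2.0, REACTION_UL)       # 2 mM ATP
biotin_nmol = nmol_from_mM(0.002, REACTION_UL)  # 2 µM biotin
gmpm4_nmol = nmol_from_ug(5.0, GMPM4_DA)        # about 5 µg protein

# Assumption: one biotin acceptor site per GmPM4 polypeptide.
excess = biotin_nmol / gmpm4_nmol
print(f"ATP {atp_nmol:.0f} nmol, biotin {biotin_nmol * 1000:.0f} pmol, "
      f"GmPM4 {gmpm4_nmol * 1000:.1f} pmol, biotin excess {excess:.1f}x")
```

Under these assumptions free biotin is in only a few-fold molar excess over the recombinant protein, while ATP is present in far larger amounts.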



Results 

cDNA cloning and deduced protein sequence analysis 

A hybrid-select translation assay indicated that there
is only one protein band corresponding to the GmPM4
cDNA, with an apparent molecular mass of 70 kDa
(lane a', Figure 1a). The nucleotide sequence of
pGmPM4 contains 2134 bp, with 60 bp and 142 bp
in the 5'- and 3'-untranslated regions, respectively.
The putative protein comprises 643 amino acids
with a predicted molecular mass of 67 988 Da, similar
to the result obtained from hybrid-select translation.
The pI value of the deduced GmPM4 protein is 6.1.
The protein is highly hydrophilic, as revealed by the
hydropathy plot (data not shown) and its preponderance
of charged amino acids, although Ala
and Gly are also abundant. Southern blot analysis of
the soybean genomic DNA using the pGmPM4 insert as
the probe reveals one to two intense bands when
hybridized under conditions of high stringency (data
not shown). It is therefore possible that the GmPM4
protein is encoded by a small gene family.
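
The figures quoted above for the cDNA are internally consistent, which can be verified with a short calculation. The sketch assumes that the open reading frame is everything between the two untranslated regions and that it ends with a single stop codon; the variable names are illustrative.

```python
CDNA_BP = 2134       # total length of the pGmPM4 insert
UTR5_BP = 60         # 5'-untranslated region
UTR3_BP = 142        # 3'-untranslated region
RESIDUES = 643       # length of the deduced protein
MASS_DA = 67988      # predicted molecular mass

coding_bp = CDNA_BP - UTR5_BP - UTR3_BP
codons, remainder = divmod(coding_bp, 3)

# Assumption: the last codon of the reading frame is the stop codon.
sense_codons = codons - 1
mean_residue_mass = MASS_DA / RESIDUES

print(coding_bp, codons, remainder, sense_codons)
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
```

The reading frame is a whole number of codons, and the sense codons match the 643 residues reported. The mean residue mass of roughly 106 Da is on the low side for a protein, which fits the stated abundance of Ala and Gly.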

Since most of the GmPM clones belong to the 
LEA family, biochemical properties of the GmPM4 
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Figure 1. Total proteins and immunoblot analysis of soybean seeds 
and E. coli cells expressing GmPM4. a. Lane a', fluorogram of
hybrid-selected translation product using pGmPM4 DNA. Lanes
a to l are Coomassie blue-stained protein profiles. The following
samples were loaded onto the lanes: total proteins (lane a) and
boiling-soluble proteins (lane b) from fresh 35 DAF soybean seeds;
total proteins (lane c) and boiling-soluble proteins (lane d) from
4-day PD 35 DAF soybean seeds; total proteins (lanes e-h) and
boiling-soluble proteins (lanes i-l) from the control E. coli (lanes
e, i), the recombinant E. coli at 0 h (lanes f, j), 1.5 h (lanes g,
k), and 3 h (lanes h, l) after the addition of IPTG to the media.
The arrows on the left indicate the molecular weight standards. b.
Biotinyl proteins in the crude extract from soybean seeds or E. coli
using AP-conjugated streptavidin as a specific reagent. Lanes a to l,
the same as in Panel a. The arrow on the left indicates the position
of 70 kDa.
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Figure 2. Repeated amino acid sequences in GmPM4 protein. The 
positions, repeat length, and number of residues separating the
repeats are indicated.

are compared with those of the LEA proteins. Like
many LEA proteins, GmPM4 contains large amounts
of Ala, Arg, Glu, Gly, Lys and Thr. However, in contrast
to LEA proteins, GmPM4 includes one Cys and
two Trp residues. There are several repeated homologous
stretches of 8, 10, 11, 12 or 17 amino acids, as
shown in Figure 2. These repeated stretches
are separated by several amino acid residues.
This feature differs from that of
the group 3 LEA proteins, in which direct, tandem
repeats of an 11-mer are found. The homology scan by
BLAST or TFASTA of the Genetics Computer Group
Sequence Analysis software package [7] shows 20
to 25% identity between a central region of about 250
amino acids of the GmPM4 protein and several of
the group 3 LEA proteins, including birch BP8 [29],
soybean GmPM2 [19], GmPM8 and GmPM10 [20].

GmPM4 protein is similar to pea SBP65 

A search for protein sequence homology in data banks
reveals a strong similarity between the GmPM4 protein
and the biotinylated protein SBP65 from pea seeds
[11]. The similarity between the two proteins is 71.5%
and the identity 54.1%. SBP65 is a little smaller,
with only 551 amino acids. This protein has been
studied intensively by Duval et al. [10-14]. It has an
apparent molecular mass of 65 kDa, does not exhibit
any of the biotin-containing carboxylase activities and
represents a major part of the total protein-bound
biotin in pea seeds [12]. The protein is localized in the
cytosol [14] and specifically expressed in late-maturing
seeds, and is absent in leaf, root, stem, pod
and flower tissues of the pea plants [10, 13]. It is proposed
that SBP65 acts as a storage form of biotin to
support seedling growth during germination, when
there is a strong demand for biotin-containing carboxylases
such as acetyl CoA carboxylase
[12].

Figure 1 shows the profiles of total extractable
proteins (Panel a) and avidin-detectable biotinylated
proteins (Panel b) in several soybean tissues. For
detection of biotinylation, a protein blotting and
streptavidin/alkaline phosphatase (AP) system was used.
Only one biotinylated protein was detected in the
4-day PD, 35 DAF soybean seeds, and the apparent
molecular mass of this protein was found to be the
same as that of the GmPM4 protein, i.e. 70 kDa. Some
soybean biotin-containing carboxylases have been identified
or purified, or their cDNAs cloned, and the molecular
mass of acetyl CoA carboxylase was found to be 58,



Figure 3. The presence of GmPM4 protein in developing, mature,
germinating soybean seeds and seedlings. Biotinyl proteins
in the crude extract from soybean seeds were detected using
AP-conjugated streptavidin as the specific reagent (Panel a). For
comparison, antibodies against 130 kDa soybean seed maturation
protein [24] (Panel b), maturation protein GmPM8 [20] (Panel c),
and storage protein glycinin (Panel d) were also used. The following
samples were loaded onto the lanes: total proteins of 35 DAF (lane
a), 45 DAF (lane b), 55 DAF (lane c), 65 DAF (lane d) and mature
(M, lane e) soybean seeds, total proteins of 0 day PD (DPD) (lane
f), 1 day PD (lane g), 2 day PD (lane h), 4 day PD (lane i), 7 day PD
(lane j) 35 DAF soybean seeds, total proteins of 0 h after imbibition
(HAI) (lane k), 6 HAI (lane l), 18 HAI (lane m), 24 HAI (lane n), 36
HAI (lane o), 48 HAI (lane p) soybean seeds, and total leaf proteins
from control soybean seedlings with four compound leaves (lane q),
or those treated with 2% PEG for 5 days (lane r).

65, and 240 kDa [4], and that of methylcrotonoyl-CoA
carboxylase was 85 kDa [38]. Hence, the 70 kDa
biotinylated protein is very likely the same as the GmPM4
protein. This protein is present in very low abundance
in the fresh 35 DAF seeds (lane a) but in high abundance in the
4-day PD seeds (lane c). This seed protein stays in the
supernatant after extracts of the total seed proteins are
boiled (lanes b and d, Panels a and b), a property similar
to most of the soybean seed maturation proteins [1].
We have therefore characterized this soybean
GmPM4 protein, like the pea SBP65
protein, as biotinylated.

Expression of GmPM4 protein in developing and
germinating soybean seeds

To evaluate the expression of GmPM4 protein in de- 
veloping and genninating soybean seeds, we extracted 
proteins from artificially dried, naturally mature or 
germinating seeds. Total proteins from young leaves 
of control soybean seedlings or water-stressed, PEG- 
treated seedlings were also used in the study. Figure 3a
shows the patterns of biotinylated protein present in
these tissues. The data show that little or no GmPM4
protein was detected in young developing seeds (35 
DAF, lane a), and the protein accumulates to a high 
level after 45 DAF (lane b), and remains at a simi- 
lar high level in 65 DAF (lane d) or mature (lane e) 
seeds. When the 35 DAF seeds were artificially dried, 
the GmPM4 protein increased rapidly to a high level 
after only one day PD treatment (lane g), and remained 
at high levels in 2 PD (lane h), 4 PD (lane i) and 
7 PD (lane j) seeds. The levels of another two soy- 
bean seed maturation proteins, 130 kDa [24] (Panel 
b) and GmPM8 [20] (Panel c), and a storage protein 
glycinin (Panel d) in these tissues were also studied. 
These two maturation proteins had the same accumu- 
lation patterns as GmPM4 had, i.e. little or no protein 
was detected in young developing seeds and the levels 
increased in late maturing or artificially dried seeds. 
The same pattern of protein accumulation has been
reported [1, 31], and is characteristic of seed
maturation proteins. For the storage protein glycinin,
however, the amount was already very high in 35 DAF 
seeds, the youngest one used in the study. After seed 
imbibition, the levels of GmPM4 and 130 kDa pro- 
tein decreased gradually and disappeared 48 h after 
imbibition (HAI, lanes k to p), while the levels of 
GmPM8 protein and glycinin remained relatively high
at 48 HAI. These four seed proteins all appeared to 
be seed-specific since no protein could be detected 
in leaves (lane q) and other tissues (data not shown). 
Also there was no accumulation of these proteins in 
the leaves of water-stressed seedlings (lanes q and r). 

GmPM4 mRNA accumulates in the vascular system 

In situ mRNA hybridization analysis with digoxigenin-
labeled riboprobes was conducted using transverse
sections prepared from 4-day PD 35 DAF soybean
seeds. High levels of GmPM4 mRNA accumulated
at the leaf primordium (Figure 4c) and at
the central stele of the hypocotyl-radicle axis (Figure 4b).
In the vascular bundle, the level of mRNA was high in
the metaxylem and phloem and slightly lower in pith
tissues, while no signal was detected in the protoxylem tissue
(Figure 4d). GmPM4 mRNA was also not detected in
the remaining endosperm and the parenchymal cells
of the cotyledon, and very low message levels, if any,
were found in the cortex of the hypocotyl-radicle axis and
vascular tissues of the cotyledon (Figure 4b). Western blot analysis of
cotyledon and embryonic axis proteins also revealed 
that the level of GmPM4 protein was very high in the 



axis and low in cotyledon (data not shown). Speci- 
ficity of in situ hybridization reactions was confirmed 
by the lack of appreciable reaction of the GmPM4 
sense strand probe with the paraffin-embedded sec- 
tions (Figure 4a). 

Based on these results, we conclude that cell-type- 
specific expression of GmPM4 gene occurs during the 
late maturation stage of soybean seed development. 
Gene expression at the transcription level is high- 
est in the leaf primordia and the vascular system of 
hypocotyl-radicle axis, low in the vascular system of 
cotyledon, and minimal or not detectable in most other 
tissues. 

Expression of recombinant GmPM4 protein in E. coli 

There is an NdeI site in the multiple cloning
site of the expression vector pET-24. The restriction
site of NdeI is CATATG, and this ATG should be
utilized as the start codon for the recombinant protein.
There is only one NdeI site in pGmPM4 DNA,
which is located exactly at the start codon. For the
expression of GmPM4 protein in E. coli, the 2 kb
NdeI-EcoRI fragment from pGmPM4 containing the
whole open reading frame (ORF) was ligated to the
5.3 kb NdeI-EcoRI fragment of pET-24a(+). Thus, the
recombinant GmPM4 protein utilized its own start and 
stop codons for translation, and should have the same 
molecular mass as that in the soybean seeds. Since 
there was no T7.Tag sequence or His.Tag sequence 
in the ORF of the recombinant protein, this protein 
could not be purified using affinity chromatography, 
as could many other recombinant proteins using the 
pET systems. 
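
The cloning logic above (a single NdeI site, CATATG, whose ATG doubles as the start codon) can be checked on any candidate insert with a few lines of code. The following is a minimal sketch only; the sequence `toy_insert` and the helper names are hypothetical and are not taken from pGmPM4.

```python
# Sketch: check that an insert has a single NdeI site and locate its ATG.
NDEI_SITE = "CATATG"  # NdeI recognition sequence quoted in the text


def find_sites(seq: str, site: str = NDEI_SITE) -> list[int]:
    """Return the 0-based start index of every occurrence of `site`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def atg_index(site_start: int) -> int:
    """0-based index of the ATG that forms the last three bases of the site."""
    return site_start + 3


# Hypothetical toy insert, NOT the real pGmPM4 sequence.
toy_insert = "GGATCCCATATGGCTTCTGAACAGTAAGAATTC"
sites = find_sites(toy_insert)
start_codon = toy_insert[atg_index(sites[0]):atg_index(sites[0]) + 3]
print(sites, start_codon)
```

A construct is suitable for this strategy only if `find_sites` returns exactly one position and the ATG at that position is the first codon of the open reading frame.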

Figure 1 illustrates the effective induction of re- 
combinant GmPM4 protein by the application of IPTG 
to the culture medium (lanes e to h, Panel a). The 
apparent molecular mass of the recombinant protein 
is indeed identical to that from soybean seeds, i.e. 
70 kDa as analyzed by SDS-PAGE. This recombinant
protein stays in the extraction solution after being
boiled (lanes i to l, Panel a) and is thus boiling-soluble
as was found for the endogenous protein extracted 
from soybean seeds. However, the recombinant pro- 
teins produced from E. coli were not detectable by 
streptavidin binding assay (lanes f to h, Panel b). 
Therefore, in contrast to the biotinylated soybean
GmPM4 produced endogenously, the recombinant
GmPM4 protein produced in a heterologous system
was not biotinylated in E. coli, although a concentration
of 4 µM biotin was supplemented to the culture
medium.

Figure 4. Localization of GmPM4 transcripts in 4-day PD soybean seeds. In situ hybridization was carried out with transverse sections of
pod-dried soybean seeds. The development time of the reaction with the color substrates was 25 h. a. Transverse section through the region
of embryonic axis and two pieces of cotyledon hybridized with the sense probe; bar = 50 µm. b. Section similar to the one shown in a but
hybridized with the antisense probe; bar = 50 µm. c. Cross section of the plumule hybridized with antisense probe; bar = 5 µm. d. Higher
magnification of the central stele in b; bar = 10 µm. Abbreviations: Ct, cotyledon; En, endosperm; HR, hypocotyl-radicle axis; Mx, metaxylem;
P, pith; Ph, phloem; Px, protoxylem.

To test whether the E. coli GmPM4 recombinant 
proteins can be biotinylated by crude seed extracts, 
we have employed an in vitro assay system, where a 
crude extract was prepared from late developing pea 
seeds. Since pea SBP65 and soybean GmPM4 proteins
have different apparent molecular masses, i.e. 65
and 70 kDa, respectively, we reasoned that this
difference would allow us to interpret the test results.
NADH, ATP, and biotin were used as energy sources
and substrates in the reaction buffer, as described in
Materials and methods. Figure 5 shows that no 70 kDa
biotinylated protein was detected after incubating the 
recombinant proteins with pea crude extracts for two 
hours (lanes f and h, Figure 5b). Thus, the GmPM4 re- 
combinant protein obviously could not be biotinylated 
in the in vitro assay employed. 

Discussion 

A new seed maturation protein group 

The developmental pattern of biotinylated proteins 
(BP) during embryogenesis, maturation and germina- 
tion of soybean seeds was characterized by Shatters 
and his colleagues [1, 6, 35]. There are three of these
proteins, with an apparent molecular mass of 85 kDa 
(BP85), 75 kDa (BP75), and 35 kDa (BP35), respec- 
tively. These authors predicted that the BP75 might be 
related to the pea SBP65. Due to the similarities in ap- 
parent molecular weight, tissue specificity and protein 
accumulation patterns, the GmPM4 and BP75 are the 
same protein. 

GmPM4, BP75 and SBP65 proteins were all seed- 
specific biotinylated proteins present in the embryonic 
axes and the cotyledons. Only traces of these pro- 
teins, if any, were detected at early stages of seed 
development. The accumulation of these proteins was
induced by artificial drying of the immature seeds or
by natural maturation, during which water loss also occurred.
After seed imbibition, these proteins disappeared at
an early stage of germination. These proteins were not
present in any other tissues examined including leaves, 
Figure 5. In vitro biotinylation assay of GmPM4 recombinant proteins.
a. Lanes a to i are Coomassie blue-stained protein profiles.
The following samples were loaded onto the lanes: total proteins of 
fresh (lane a), 4-day PD (lane b) 35 DAF soybean seeds, total pro- 
teins of late maturing pea seeds extracted with buffer B (lane c) or 
buffer A (lane d), crude E. coli extract expressing GmPM4 (lane i), 
biotinylation assay including crude E. coli extract, crude pea extract 
(buffer B), and without (lane e) or with (lane f) ATP and NADH, 
biotinylation assay including crude E. coli extract, crude pea extract 
(buffer A), and without (lane g) or with (lane h) ATP and NADH. 
Arrows on the left indicate the molecular weight standards. b. Biotinyl
proteins detected with AP-conjugated streptavidin. Lanes a to
i, the same as in Panel a. Arrows on the left indicate the positions of 
65 kDa (pea SBP65) and 70 kDa (soybean GmPM4). 



stems or roots. The protein expression patterns were
the same as those of seed maturation proteins, and
were quite different from those of other biotin-containing
carboxylases.

All the GmPM cDNA clones were isolated by
differential screening. Most of the GmPM clones
sequenced fulfil the properties of LEA proteins, as
reviewed in the Introduction. These GmPM clones
represent relatively abundant messages
expressed during the late seed maturation stage. Out of the
456 positive plaques selected as highly expressed messages
in pod-dried soybean seeds, the GmPM6/7 and
GmPM1/9 families accounted for 100 and 46 clones,
respectively [24]. However, only 4 of them were
GmPM4 clones. The GmPM4 message is therefore 
much less abundant than many LEA messages. 
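
The relative abundances implied by these plaque counts can be worked out directly. The short sketch below uses only the counts quoted above (456 plaques; 100, 46 and 4 clones); the variable and function names are illustrative.

```python
# Sketch: share of each clone family among the 456 positive plaques.
TOTAL_PLAQUES = 456
clone_counts = {"GmPM6/7": 100, "GmPM1/9": 46, "GmPM4": 4}


def percent_of_total(count: int, total: int = TOTAL_PLAQUES) -> float:
    """Percentage of all positive plaques represented by `count` clones."""
    return 100.0 * count / total


shares = {name: round(percent_of_total(n), 1) for name, n in clone_counts.items()}
fold_difference = clone_counts["GmPM6/7"] / clone_counts["GmPM4"]
print(shares, fold_difference)
```

Plaque frequency is only a rough proxy for message abundance, so these percentages should be read as an estimate of rank order rather than as a quantitative measurement.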

Several characteristics of the GmPM4 and SBP65 
are quite similar to those of LEA proteins, including 
their being highly hydrophilic, boiling-soluble, and 
having many A, D, G, K, R and T residues. However,
GmPM4 contains 1 Cys and 2 Trp residues and
SBP65 contains 2 Cys and 2 Trp residues. This is quite
different from the general features of LEA proteins.
The amino acid sequences of GmPM4 and SBP65 do
not show similarity with group 1, 2 or 4 LEA proteins.
Although there are several repeated homologous
stretches in both proteins, they are not tandemly
repeating 11-mers as in LEA 3 proteins. The message
of GmPM4 is also less abundant, as indicated previously.
Given these points, soybean GmPM4 and pea SBP65
do not fully match the characteristics of LEA proteins.
Nevertheless, they form a new group of seed maturation
proteins, and may act as a biotin storage pool to
support early growth during seed germination.

Biotinylated proteins 

The coenzyme biotin (vitamin H) is a carrier of
carbon dioxide in enzymatic carboxylation and
transcarboxylation reactions [e.g. 27]. It is known that
there is a single biotinylated enzyme in E. coli,
three in yeast, and four in animal or plant systems
[3, 15, 30, 39]. The four biotinylated enzymes in
plants have been identified as acetyl-CoA carboxylase,
3-methylcrotonoyl-CoA carboxylase, propionyl-CoA
carboxylase and pyruvate carboxylase. Several
of these carboxylase genes from E. coli, yeast, animals
and plants have been cloned and sequenced. Analysis
of the biotinylated domains and alignment of
the predicted sequences suggested that the biotin
is covalently linked to the epsilon amino group
of a specific lysine residue located within a highly
conserved region of Met-Lys-Met.
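
As an illustration of how the conserved Met-Lys-Met region can be located in a protein sequence, the sketch below scans for the tripeptide MKM and reports the position of the central lysine. The example peptide is a hypothetical fragment chosen for the demonstration, and a match only flags a candidate site; it does not show that the lysine is actually biotinylated.

```python
# Sketch: find candidate biotinyl-lysine positions inside Met-Lys-Met motifs.
import re

# Lookahead so that overlapping motifs (e.g. "MKMKM") are all reported.
MKM_MOTIF = re.compile(r"(?=MKM)")


def candidate_biotinyl_lysines(protein: str) -> list[int]:
    """Return 1-based positions of the Lys inside every Met-Lys-Met motif."""
    return [m.start() + 2 for m in MKM_MOTIF.finditer(protein.upper())]


# Hypothetical peptide fragment used only for illustration.
toy_peptide = "GSEAMKMMNQIE"
print(candidate_biotinyl_lysines(toy_peptide))
```

By this criterion SBP65 and GmPM4 would not be flagged, which is consistent with the observation discussed below that their biotinylation domain differs from that of the carboxylases.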

Pea SBP65 contains no carboxylase activity, and
the analysis of the biotinyl peptide indicated that the
biotinyl lysine was located at position 103 [11]. Figure 6
illustrates the biotinylation domains of biotin
enzymes, SBP65 and GmPM4. These regions are
highly conserved for SBP65 and GmPM4, with double
Glys at both ends, three acidic and three basic amino
acids in between, and the biotinyl Lys at the center
of the conserved motif. The biotinylation of protein
is an ATP-dependent process in which biotinyl-AMP
is synthesized as an intermediate [6]. The double Gly
residues might play a very important role in the structure
and function of GmPM4 and SBP65 proteins. Previous
studies have shown that several glycyl residues
are close to a reactive lysyl residue in a motif involved
in nucleotide binding in several proteins. For
instance, a segment of Gly-Gly-Pro-Gly-Ser-Gly-Lys
containing the reactive Lys is important for binding of
the substrates in adenylate kinase [40], and a segment
of Lys-Thr-Gly-Gly-Leu is the active site for starch
synthase [17].

Figure 6. Comparisons of the biotinylation domains of biotin
enzymes, pea SBP65 and soybean GmPM4. The consensus sequence
of biotin enzymes was obtained by alignment of several biotin
carboxylases from E. coli, yeast, animal and plant. The biotinylated
lysine (K) is printed in bold. The double Glys at both ends are
underlined.

Figure 6 shows that only the biotinylated Lys
and one Val are conserved when GmPM4
and SBP65 are compared with the consensus sequence of
biotin-containing enzymes. Several biotinylated carboxylases
from animals or plants, e.g. soybean
3-methylcrotonoyl-CoA carboxylase, were expressed in
E. coli, and the resulting chimeric protein was biotinylated
[38]. In our study, however, the recombinant,
E. coli-produced GmPM4 protein was not biotinylated
(Figure 1). Thus, although the biotin-carrying
domain of biotin-containing carboxylases is conserved
throughout all kingdoms, the domain of SBP65 and
GmPM4 proteins appears to be unique and could not
be recognized by a prokaryotic biotinylation system.

Possible roles of GmPM4 proteins in soybean seed
development 

It has been suggested that the presence of soybean seed 
maturation proteins might be a prerequisite for normal 
germination and the subsequent seedling growth [21, 
31]. Using pea as a model system, Duval et al. [9]
demonstrated that there was a very high demand for
biotin in early seedling growth since the activities of 
acetyl CoA carboxylase and 3-methylcrotonoyl-CoA 
carboxylase increased strongly in the seedlings 3 to 
8 days after imbibition. Since the GmPM4 proteins
do not appear to fulfil the biochemical properties of
LEA proteins, their messages are not very abundant
in mature seeds, and they are not expressed in water-stressed
seedlings, we suggest that the physiological role of
GmPM4 protein might differ from that of the LEA
proteins, i.e. desiccation protection. Instead, these
proteins may be required for the subsequent normal
germination since they would supply an essential cofactor,
biotin, for seedling growth.

The GmPM4 message is tissue-specific and localized
to the leaf primordium and the vascular tissues of the
cotyledon and the hypocotyl-radicle axis. Reserve
hydrolysis in germinating seeds follows specific patterns;
for instance, mobilization begins around the vascular
strands in germinating soybean seeds [2]. The
localization of GmPM4 proteins and the initiation site
of reserve utilization coincide well, suggesting that
the biotin ligands of GmPM4 proteins might be
utilized by the related carboxylases during the
early stage of seed germination.
The importance of biotin in seed development was 
also demonstrated by the identification and character- 
ization of a biotin auxotroph mutant of Arabidopsis 
[34]. No biotin was detectable in the seeds of this 
embryo-lethal mutant, but these embryos could be 
rescued when grown in the presence of biotin [36]. 
The presence of GmPM4 protein in mature soybean 
seeds might be essential to function as a source of 
biotin for several growth-limiting enzymes that are 
necessary during seed development and the subse- 
quent germination stages, and thus may play some 
roles in determining seed germination capacity. 
Using avidin cDNA as a hybridisation probe, we detected a gene family whose putative products
are related to the chicken egg-white avidin. Two overlapping genomic clones were found to contain
five genes (avidin-related genes 1-5, avr1-avr5), which have been cloned, characterized and
sequenced. All of the genes have a four-exon structure with an overall identity with the avidin cDNA
of 88-92%. The genes appear to have no pseudogenic features and, in fact, two of these genes
have been shown to be transcribed. The putative proteins share a sequence identity of 68-78%
with avidin. The amino acid residues responsible for the biotin-binding activity of avidin and the
bacterial biotin-binding protein, streptavidin, are highly conserved. Since avidin is induced in both
a progesterone-specific manner and in connection with inflammation, these genes offer a valuable
tool to study complex gene regulation in vivo.



Avidin is a basic glycoprotein found in avian, reptilian
and amphibian egg white (Hertz and Sebrell, 1942; Jones
and Briggs, 1962; Korpela et al., 1981). It is composed of
four identical subunits, each consisting of 128 amino acids,
of known sequence (DeLange and Huang, 1971). Avidin is
capable of binding the vitamin biotin with an exceptionally
tight non-covalent bond (Kd ≈ 10^-15 M; Green, 1975),
suggesting that it may function as an antibiotic protein inhibiting
bacterial growth.

Avidin is induced in the oviduct of the estrogen-pretreated
chick by a single steroid hormone, progesterone
(O'Malley et al., 1969; Tuohimaa et al., 1989). This
progesterone induction requires mRNA synthesis and is followed
by a parallel increase in both the avidin mRNA and protein
amounts, suggesting that the induction occurs mainly at the
transcriptional level. In addition, avidin is induced in
connection with tissue trauma (Heinonen et al., 1978) and toxic
agents (Elo et al., 1975; Heinonen and Tuohimaa, 1978) as
well as a consequence of heat injury (Elo, 1980) and bacterial
or viral infections (Elo et al., 1980; Korpela et al., 1982;
Korpela et al., 1983). This expression is detected in all
chicken tissues tested except the brain. Since the
inflammation-associated induction needs no estrogen pretreatment and
can be abolished by antiinflammatory drugs (Nordback et
al., 1982; Niemelä, 1988), the product is called 'inflammation
avidin'. Altogether, avidin seems to have at least two
partially different induction mechanisms (Tuohimaa et al.,
1989).

Correspondence to M. S. Kulomaa, University of Jyväskylä,
Department of Biology, P. O. Box 35, SF-40351 Jyväskylä, Finland
Fax: +358 41 602 221.

Abbreviations. avr, avidin-related gene; AVR, avidin-related
protein; PCR, polymerase chain reaction.

Note. The nucleotide sequence data reported in this article have
been submitted to the EMBL data bank and are available under the
accession numbers Z21611, Z21554, Z21612, Z22883 and Z22882
for avr1-avr5, respectively. The first two authors contributed
equally to this study.

In order to isolate the avidin gene, the cDNA was cloned
(Gope et al., 1987). This cDNA was used as a hybridisation
probe in the isolation of a genomic clone (λgAV1, previously
called λgAV12201; Keinänen et al., 1988). The λgAV1 clone
contains three genes closely related to the avidin cDNA. We
call these genes avidin-related genes 1-3, avr1-avr3
(referred to as pgAV1.8, pgAV3.7 and pgAV3.3 in Keinänen
et al., 1988), since they are unable to encode the known
peptide sequence of egg-white avidin (DeLange and Huang,
1971). In this study, we report molecular cloning of two more
members of the chicken avidin gene family and the nucleotide
sequence of all five avidin-related genes.

MATERIALS AND METHODS 
Materials 

Restriction and modifying enzymes were purchased from
New England Biolabs, Boehringer-Mannheim and Promega.
The Sequenase version 2.0 or Sequenase version 1.0 DNA
sequencing kit was a product of United States Biochemicals.
The AutoRead fluorescent sequencing kit and the FluorePrime
kit were purchased from Pharmacia. Nylon filter
(Hybond-N, 0.45-µm pore size) was obtained from Amersham
International and [α-35S]dATP, [α-32P]dTTP and
[α-32P]dCTP (3000 Ci/mmol) were obtained from NEN Research
Products. Common laboratory reagents were from
Sigma, Merck or J. T. Baker.

Construction and screening of a chicken oviduct genomic 
DNA library 

A chicken genomic library was constructed as previously
described (Kleinsek et al., 1986) and was kindly provided by
Prof. B. W. O'Malley. Briefly, chicken oviduct DNA was
partially digested by MboI, and 16-24-kb fragments were separated
and ligated into the BamHI-digested bacteriophage λ
derivative EMBL4 vector before packaging in vitro. Genomic
clones were screened using nick-translated (Rigby et al.,
1977) 32P-labelled chicken avidin cDNA (Gope et al., 1987)
probes, either the full-length cDNA or its 3' end (397 bp;
positions 198-594), in in situ plaque hybridisation (Benton
and Davis, 1977; Keinänen et al., 1988). The Escherichia
coli NM539 strain was used to provide host cells.

Characterization of the genomic clones

Recombinant DNA from the genomic clone λgAV2 was
isolated for characterization according to Benson and Taylor
(1984). The insert was subcloned (Maniatis et al., 1982) into
the pBR322 plasmid and an M13 sequencing vector after
HindIII digestion. Recombinant plasmids were introduced
into competent E. coli RR1 cells (Hanahan, 1983).

Plasmid DNA was isolated from the transformants by an
alkaline-detergent method (Birnboim and Doly, 1979) and
digested with HindIII. After transfer of DNA onto a nylon
filter (Southern, 1975), the inserts were characterized by
DNA hybridisation analysis. For hybridisations, 20×NaCl/Cit
was prepared containing 3.0 M NaCl, 0.3 M sodium
citrate, pH 7.0. Reagents for 50×Denhardt's solution were
5 g Ficoll (type 400; Pharmacia), 6 g polyvinylpyrrolidone,
5 g bovine serum albumin (fraction 5; Sigma) and H2O to
500 ml. Prehybridisation was performed in 6×NaCl/Cit/
10×Denhardt's solution/0.1% SDS/50 µg/ml denatured herring
sperm DNA at 68°C for at least 4 h with hybridisation
overnight at 68°C in an identical solution containing the
denatured 32P-labelled cDNA probe [full-length probe, or the
5'-end (positions 1-197), or the 3'-end (positions 198-569)
of the cDNA]. Filters were washed at 68°C with 2×NaCl/
Cit/0.1% SDS (3×5 min and 3×30 min) and with
0.1×NaCl/Cit/0.1% SDS (30 min) before autoradiography.
In addition, subclones giving positive hybridisation signals
were analysed by restriction enzyme mapping and sequencing.
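
The working concentrations of the hybridisation and wash solutions follow from the 20×NaCl/Cit stock by simple dilution. The sketch below assumes only the stock composition stated above (3.0 M NaCl, 0.3 M sodium citrate); the function names and the 500-ml batch size are illustrative.

```python
# Sketch: dilute a 20x NaCl/Cit stock to the working strengths used above.
STOCK_FOLD = 20
STOCK_MOLARITY = {"NaCl": 3.0, "sodium citrate": 0.3}  # mol/l in the 20x stock


def working_molarity(fold: float) -> dict[str, float]:
    """Molarity of each solute in a `fold`x NaCl/Cit solution."""
    return {solute: m * fold / STOCK_FOLD for solute, m in STOCK_MOLARITY.items()}


def stock_volume_ml(fold: float, final_volume_ml: float) -> float:
    """Volume of 20x stock needed for `final_volume_ml` of a `fold`x solution."""
    return final_volume_ml * fold / STOCK_FOLD


for fold in (6, 2, 0.1):
    print(fold, working_molarity(fold), stock_volume_ml(fold, 500))
```

The remaining components (Denhardt's solution, SDS and carrier DNA) are added from their own stocks and are not covered by this calculation.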

For genomic blotting, the chromosomal DNA was isolated
from chicken muscle (Strauss, 1988) and digested with
EcoRI, BamHI or HindIII. DNA was analysed on a 0.8%
agarose gel. The DNA hybridisation analysis was performed
as described above using 32P-labelled full-length cDNA as a
probe.

Polymerase chain reaction amplification 
and cloning of the avr3 gene 

To obtain the 5' end of the avr3 gene, primers specific
for the avr genes and the avidin cDNA (corresponding to
nucleotide positions 202-222 and 1302-1321 in Fig. 3)
were used to amplify the genes from chicken genomic DNA.
As template, 100 ng DNA was used in 50-µl reactions with
buffer A (final concentration 50 mM KCl, 10 mM Tris/HCl,
pH 9.0, 25°C with 1.5 mM MgCl2, 0.01% gelatin, 0.1% Triton
X-100), 50 pmol both primers, 200 µmol of each dNTP
and 2.5 U Taq DNA polymerase overlayed with 100 µl
mineral oil. Amplification reactions took place in a thermal
cycler (PTC-100 programmable thermal controller, MJ Research).
Thirty cycles of amplification (each cycle including
denaturation at 94°C for 1 min, primer annealing at 35°C for
1 min and polymerisation at 72°C for 1.5 min) were performed.
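
The expected product size and the minimum run time follow from the primer coordinates and the cycling programme given above. The sketch below is illustrative only: coordinates are alignment positions from Fig. 3, so the real product may be a few base pairs shorter where the alignment contains gaps, and ramp times between temperatures are ignored.

```python
# Sketch: expected amplicon length and hold time for the PCR described above.
FORWARD_PRIMER = (202, 222)    # alignment positions, 1-based inclusive
REVERSE_PRIMER = (1302, 1321)
CYCLES = 30
STEP_MINUTES = {"denaturation": 1.0, "annealing": 1.0, "polymerisation": 1.5}


def span_length(start: int, end: int) -> int:
    """Number of positions in a 1-based inclusive interval."""
    return end - start + 1


def amplicon_length(forward: tuple[int, int], reverse: tuple[int, int]) -> int:
    """Length from the first forward-primer base to the last reverse-primer base."""
    return span_length(forward[0], reverse[1])


def total_hold_minutes(cycles: int, steps: dict[str, float]) -> float:
    """Sum of programmed hold times over all cycles (ramping not included)."""
    return cycles * sum(steps.values())


print(span_length(*FORWARD_PRIMER), span_length(*REVERSE_PRIMER))
print(amplicon_length(FORWARD_PRIMER, REVERSE_PRIMER))
print(total_hold_minutes(CYCLES, STEP_MINUTES))
```

The computed span of about 1.1 kb agrees with the fragment size reported in the Results.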



The polymerase-chain-reaction (PCR) products were
purified from NuSieve GTG agarose (FMC BioProducts)
using Magic PCR Preps (Promega) and cloned using a TA
Cloning System version 1.3 (Invitrogen). Plasmid DNA was
isolated with Magic Minipreps (Promega). The clones for
sequencing were selected by restriction enzyme analysis and
large-scale isolation was performed with Magic Maxipreps
(Promega).

DNA sequencing 

The nucleotide sequence was determined by the
dideoxynucleotide chain-termination sequencing method (Sanger et
al., 1977) using modified T7 DNA polymerase. The Klenow
fragment of DNA polymerase I was also used for sequencing
of avr1. Double-stranded sequencing of the PCR-generated
fragment of avr3 was performed following the instructions
of the AutoRead fluorescent sequencing kit (Pharmacia).
Analysis of the reaction products was made using a Pharmacia
ALF automated DNA sequencer (Ansorge et al.,
1987). The primers were either M13 universal or avr-specific
primers synthesised with a Cyclone DNA synthesiser
(Milligen/Biosearch). For automated sequencing, M13 universal
primers or avr-specific primers were synthesised with the aid
of an Applied Biosystems 381A DNA synthesiser and
fluorescence-labelled by the FluorePrime kit from Pharmacia. All
sequences were determined from both strands.

Data analysis 

Sequence data were analysed using programs of the Genetics
Computer Group package, version 7.0 (University of
Wisconsin Genetics Computer Group, Madison, WI, USA)
installed on a Convex C3840 computer (Devereux et al.,
1984).
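
The identity percentages reported below are pairwise comparisons of aligned sequences. As a hedged illustration of the calculation (not the Genetics Computer Group implementation, whose exact treatment of gaps is not described here), the sketch below counts matching positions over all aligned columns that contain no gap in either sequence. The function name and the toy alignment are hypothetical.

```python
# Sketch: percent identity between two pre-aligned sequences of equal length.
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Identity over aligned columns; columns with a gap in either row are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns do not count either way
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment for illustration only.
print(percent_identity("ATGC-A", "ATGGTA"))
```

Counting gapped columns as mismatches instead would lower the values slightly, so the convention should be stated whenever identities from different tools are compared.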



RESULTS 

Based on the hybridisation analysis of the chicken genomic
DNA, a single-copy gene for avidin was suggested
(Gope et al., 1987). Since three different genes closely related
to the avidin cDNA had already been cloned (Keinänen
et al., 1988), the hybridisation analysis of genomic DNA was
repeated (Fig. 1). Instead of the two bands observed previously
(Gope et al., 1987), three bands were detected in the
HindIII-digested genomic DNA using the full-length avidin
cDNA as a probe. The largest DNA fragment was estimated
to be 4.5-5.0 kb. Hybridisation signals obtained from the
3.5-4.0-kb and 2.7-2.9-kb fragments were more intense
than signals obtained from the largest fragment, thus suggesting
the presence of more than one positive fragment of nearly
equal size in both bands. The result confirmed the presence
of multiple genes for avidin. In order to isolate the entire
avidin gene family, screening of the genomic library was
continued.

The second genomic clone, λgAV2, was detected after
screening of 3.5×10^6 more clones from the genomic library
(Kleinsek et al., 1986). By restriction enzyme and hybridisation
analyses the two clones were found to be partially overlapping.
The insert of λgAV2, approximately 20 kb, was digested
with HindIII and subcloned into pBR322. In addition
to avr1-avr3 (Keinänen et al., 1988), two more members of
the gene family, avr4 and avr5, were isolated.

Since the insert of λgAV1 was subcloned as EcoRI fragments
(Keinänen et al., 1988), the avr3 gene containing an
internal EcoRI site was cut. In order to obtain the 5' end
of the gene, PCR was performed. Primers, corresponding to
nucleotide positions 202-222 and 1302-1321 in Fig. 3,
were used to amplify a fragment of 1.1 kb from the chromosomal
DNA, as expected. Based on restriction enzyme analysis
of the produced subclones, one of these clones was
assumed to contain the region upstream of the EcoRI site of

Fig. 1. Hybridisation analysis of chicken genomic DNA. Chromosomal
DNA was isolated from chicken muscle, digested with EcoRI,
BamHI or HindIII, electrophoresed on an 0.8% agarose gel and
hybridised with the full-length avidin cDNA probe as described in
the Materials and Methods section (EcoRI, lane 1; BamHI, lane 2;
HindIII, lane 3). Molecular-size markers (λ/EcoRI, HindIII) are
shown (bp).

Different methods were used to determine the nucleotide
sequences of the avr genes. Both strands were sequenced
at least once. The length, approximately 1.1 kb, and general
structure of the five genes was similar (Fig. 2). Although
sequences of the avr4 and avr5 genes are with a single exception
identical, on the basis of restriction enzyme mapping of
the HindIII clones (3.9 kb), we consider them to be different
genes. All five genes contained four putative exons split by
three introns which follow the GT-AG rule at the predicted
exon-intron junctions. Identity between the putative exons
and the avidin cDNA was 77-98%. The amino acid sequences
deduced from the nucleotide sequences of avr1-avr5
showed an identity of 68-78% when compared with
avidin. Identities of single exons with those of preavidin
were 37-100% (Table 1).
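
The GT-AG rule mentioned above states that introns begin with GT and end with AG. A minimal sketch of how predicted exon coordinates can be checked against this rule is given below; the toy gene, its exon coordinates and the function names are hypothetical and are not taken from the avr sequences.

```python
# Sketch: extract introns from exon coordinates and test the GT-AG rule.
def follows_gt_ag(intron: str) -> bool:
    """True if the intron starts with GT and ends with AG."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def introns_between(gene: str, exons: list[tuple[int, int]]) -> list[str]:
    """Return intron sequences lying between consecutive exons.

    `exons` holds 1-based inclusive (start, end) pairs sorted 5' to 3'.
    """
    return [
        gene[prev_end:next_start - 1]
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


# Hypothetical four-exon toy gene (exon, intron, exon, intron, exon, intron, exon).
toy_gene = (
    "ATGGCC" "GTAAGTTTTCAG" "GGATTC" "GTGAGTCCCTAG"
    "AAGCTT" "GTATGTAAACAG" "TGGTAA"
)
toy_exons = [(1, 6), (19, 24), (37, 42), (55, 60)]

introns = introns_between(toy_gene, toy_exons)
print([follows_gt_ag(i) for i in introns])
```

Passing this test is necessary but not sufficient for correct splicing; the branch point and the wider splice junction consensus also matter, as discussed below.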

DISCUSSION 

Previous hybridisation studies suggested that avidin was
a single-copy gene (Gope et al., 1987). The nucleotide
sequence of the five avidin-related genes (Fig. 3) reveals that
all of these cloned genes have a single HindIII site close to
the 3' tail. Thus, the identification in this study of three positive
bands from the HindIII-digested genomic DNA, two of
which are suggested to be at least duplets (Fig. 1), is in keeping
with current knowledge of the avidin gene family.
Hybridisations of the genomic DNA were performed using the
3' end (384 bp; Gope et al., 1987) and the full-length cDNA
in this study as a probe, but this does not explain the discrepancy
between the earlier and now reported results. The genomic
DNAs were, however, of different origin. In the earlier
experiment, DNA was isolated from chicken spleen and in
this study DNA was isolated from chicken muscle. It would,
therefore, be interesting to further study the methylation
pattern or other modifications of the corresponding genes in
these tissues to see whether they could explain the different
results on the complexity of the avidin-encoding nucleotide
sequences in the genomic DNA.

Fig. 2. Sequencing strategy for avidin-related genes 1-5. The
exons (■) and sequences obtained from the PCR clone containing
the 5' end of the avr3 gene are shown. Sequences from the
genomic subclones are also indicated.

Table 1. Identity between the peptide chain of preavidin and
putative avidin-related proteins. The identities (%) are shown for each
exon and between the mature avidin (without the signal peptide) and
putative avidin-related proteins.

Protein   Exon 1   Exon 2   Exon 3   Exon 4   Mature
AVR1      100      62       13       64       70
AVR2       96      66       73       57       68
AVR3       96      69       78       64       72
AVR4/5     96      6$       95       64       78

Fig. 3. Sequence alignment of avidin-related genes 1-5 and avidin cDNA. The hyphens represent the corresponding nucleotide for the
consensus sequence; nucleotides that differ from the consensus sequence are shown in small letters. The hyphens on the consensus sequence
indicate that no consensus was found. Gaps in the sequence are indicated (■). The actual sequence sizes are 1335 nucleotides (avr1 and
avr2) due to a gap in nucleotide positions 531-536 in this figure, 1133 nucleotides (avr3) and 1334 nucleotides (avr4 and avr5), the latter
due to gaps at nucleotide positions 531-536 and position 1011. The sequence of avr3 starts at position 202 and ends at nucleotide 1321.
The first and last nucleotides of the cDNA corresponding to putative exons ...
published sequence (Gope et al., 1987) in parenthesis.

The presence of the avr genes in two partially overlapping genomic
clones indicates that they lie in the same DNA cluster. The exon-intron
junctions of the avr genes agreed well with the general splice junction
consensus sequences (Senapathy et al., 1990). Moreover, sequences which
are in accordance with the standard branch point consensus sequence
were found in expected intron regions, suggesting that the putative
transcripts could be properly processed. The presence of a
polyadenylation signal (AATAAA) in all genes suggests that the correct
modification of the 3' ends could occur. Finally, no internal stop
codons were detected in the putative exon sequences. Altogether, this
is consistent with the concept that the avr genes are not pseudogenes
and could be transcribed into mature mRNAs. In fact, we have shown that
at least two of these genes, avr2 and avr3, are transcribed at a low
level in the chicken oviduct and intestine, respectively, after
intraperitoneal E. coli infection (Kunnas et al., 1993). In addition,
avr3 transcripts have also been detected in a chicken macrophage HD-11
cell line (Lappalainen, P. J., Kunnas, T. A., Punnonen, E.-L. and
Kulomaa, M. S., unpublished results).
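Two of the sequence checks named above (a polyadenylation signal in the 3' region and the absence of internal stop codons in the putative exons) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy sequences are hypothetical, the coding sequence is assumed to be read in frame from its first nucleotide, and the splice-junction and branch-point comparisons are not covered.

```python
# Minimal sketch; names and toy sequences are hypothetical, not from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_polya_signal(seq: str, signal: str = "AATAAA") -> bool:
    """Return True if the polyadenylation signal occurs anywhere in seq."""
    return signal in seq.upper()


def internal_stop_positions(cds: str) -> list[int]:
    """Return 0-based nucleotide offsets of in-frame stop codons,
    ignoring the final codon (the expected terminator)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [3 * i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


toy_flank = "GGCTAATAAAGCTT"      # contains AATAAA
toy_cds_ok = "ATGGCTTGGAAATAA"    # ATG GCT TGG AAA TAA
toy_cds_bad = "ATGTAAGCTTGA"      # ATG TAA GCT TGA (internal stop)

print(has_polya_signal(toy_flank))
print(internal_stop_positions(toy_cds_ok))
print(internal_stop_positions(toy_cds_bad))
```

An empty list from `internal_stop_positions` corresponds to the observation that no internal stop codons were detected in the assumed frame; it does not by itself show that a gene is expressed.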

Exon 1, encoding the entire signal peptide of the putative
avidin-related proteins (AVR) and the first three amino acids of the
mature proteins, was the most fully conserved region when the avr genes
were compared with the avidin cDNA. Its identity with the cDNA was
97-98%. Exon 2 of all the avr genes had regions of clustered point
mutations, and the identity of the avr genes with the avidin cDNA was
77-82%. The identity in the third exon was 83-98% and the mutations
were again clearly clustered. The fourth exon was also well conserved
($2-94%). The point mutations in the introns were more uniformly
distributed than in the exons. There was a 6-bp deletion in all avr
genes at nucleotide positions 531-536 (exon 2) and a single gap was
observed in avr3-avr5 at nucleotide position 1011 (intron 2), but no
other deletions or insertions (Fig. 3). The 5' flanking region (up to
position -176 from the predicted transcription-initiation site) of
avr1, avr2, avr4 and avr5 was perfectly conserved (except for the
difference at nucleotide position 10 in Fig. 3), which might predict
the expression of the genes, i.e. the promoter region has been subject
to evolutionary pressure.
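The per-exon identities quoted above are percentages of matching positions in an alignment. A minimal sketch of such a calculation follows. It assumes that the two sequences are already aligned to equal length and that gapped columns are excluded from the denominator; the text does not state how gaps were treated, so this convention, the function name and the toy sequences are assumptions.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a nucleotide.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned fragments (hypothetical): one mismatch in eight columns.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
# One gapped column is skipped, leaving seven compared columns.
print(round(percent_identity("ACG-ACGT", "ACGTACGA"), 1))
```

Counting gapped columns as mismatches instead would lower the values for exon 2, where the 6-bp deletion lies, so the convention should be fixed before numbers are compared between exons.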

The 6-bp deletion in exon 2 would indicate a common ancestor for the
avr genes and might be due to 'slipped-strand' mispairing based on the
surrounding repeat-like region containing only cytosine and adenine
residues. The distribution of point mutations between the avr genes was
markedly similar, indicating a relatively recent duplication of the
ancestral gene. Moreover, avr1 and avr2 have many common nucleotides at
the sites where they differ from avr4 and avr5, which, as mentioned
above, were identical except for the guanine/adenine transition
(nucleotide at position 10 in Fig. 3). This would indicate an initial
diversion between two genes, one being the ancestor for avr1 and avr2.
The ancestor for avr4 and avr5, and possibly for avr3, as indicated by
a dendrogram (data not shown) produced by the program PILEUP of the GCG
package (Devereux et al., 1984), has supposedly duplicated more
recently than the ancestor for avr1 and avr2.

Based on biochemical studies, certain amino acid residues are known to
be important for the function of avidin and the bacterial
biotin-binding protein streptavidin (Gitlin et al., 1987; Gitlin et
al., 1988a; Gitlin et al., 1988b; Kurzban et al., 1989; Weber et al.,
1989; Gitlin et al., 1990). Solving the three-dimensional structure of
avidin and its functional complex with biotin revealed the amino acid
residues responsible for biotin binding (Livnah et al., 1993; Pugliese
et al., 1993). Of the 17 residues, 12-14 are conserved in all AVR
species (Fig. 4). Thr38, Ala39 and Thr40, forming hydrogen bonds with
the valeryl moiety of biotin, are substituted in the AVR species.
However, these amino acids are also substituted in streptavidin, which
folds very similarly to avidin (Weber et al., 1989; Pugliese et al.,
1993). Out of the three lysine residues suggested earlier (Gitlin et
al., 1987), Lys111 is most likely part of the biotin-binding site
(Pugliese et al., 1993). It is conserved in all AVR species except
AVR2. The carbohydrate chain of avidin is attached to Asn17, which in
the putative products of avr genes was substituted by isoleucine. The
AVR species, however, contained 2-4 potential glycosylation sites
(Asn-Xaa-Thr/Ser), although the carbohydrate side chain has been
suggested to be insignificant for the biotin binding of avidin (Hiller
et al., 1987). Secondary-structure prediction (Chou and Fasman, 1978)
of the AVR species showed a relatively high content of β-sheet
structure as predicted for avidin and streptavidin (data not shown).
Since the 'inflammation avidin' has been detected based on its ability
to bind biotin and react with a polyclonal avidin antibody (Tuohimaa et
al., 1989), it seems possible that these methods are unable to
distinguish avidin from putative avidin-related proteins. Altogether,
this suggests that the 'inflammation avidin' could be composed of
avidin and/or (an) AVR(s).
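Potential glycosylation sites of the kind counted above follow the pattern Asn-Xaa-Thr/Ser and can be located with a short scan. The sketch below is illustrative only: the function name and the toy peptide are hypothetical, and the optional exclusion of proline at the Xaa position is a commonly applied refinement that is not stated in the text.

```python
import re


def glycosylation_sites(protein: str, exclude_proline: bool = False) -> list[int]:
    """Return 1-based positions of Asn residues in Asn-Xaa-Thr/Ser motifs.

    A lookahead is used so that overlapping motifs are all reported.
    """
    middle = "[^P]" if exclude_proline else "."
    pattern = re.compile(f"(?=N{middle}[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


toy_peptide = "MNASNPTKNNT"  # hypothetical sequence, one-letter amino acid code
print(glycosylation_sites(toy_peptide))
print(glycosylation_sites(toy_peptide, exclude_proline=True))
```

A motif found in this way is only a potential site; whether it is glycosylated in the protein cannot be read from the sequence alone.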

In conclusion, this study indicates that besides the avidin gene, there
are five structurally related genes avr1-avr5 in chicken. Avidin is
induced in a progesterone-specific manner and in connection with
inflammation. Since avr2 and avr3 transcripts have been detected
(Kunnas et al., 1993), it will be interesting to compare how the
transcription of avidin and avidin-related genes is regulated. Thus the
avidin gene family offers a promising model for the study of the
complex mechanisms underlying the reproductive and defense systems.

Biotin-binding proteins in eggs of oviparous vertebrates

J.K. Korpela, M.S. Kulomaa, H.A. Elo and P.J. Tuohimaa 1

Department of Biomedical Sciences, University of Tampere, Box [...], Tampere (Finland), 12 February 1981

Summary. Biotin-binding was found in egg whites and yolks of all 23
avian species studied, and in a turtle, but the [...] considerably even
in related species. There was no clear correlation in biotin-binding
between egg white [...]



Avian egg white and the egg jelly of the frog contain a specific
biotin-binding protein called avidin 2. A biotin-binding protein
distinct from avidin has recently been discovered in the chicken egg
yolk 7,8. In contrast to avidin, the yolk biotin-binding protein is
saturated with biotin, and a special biotin-exchange assay is therefore
required for its determination 7. No comparative study of the
occurrence of the egg white and yolk biotin-binding proteins has as yet
been made. We therefore studied these proteins in the egg white and
yolk in a number of avian and reptilian species, in fish hard roe, bull
sperm and human seminal plasma.

Materials and methods. The eggs of 23 avian species, as shown in the
table, were collected in southern Finland during the breeding period
(April-July). 2 eggs of a turtle (Testudo hermanni) were utilized
immediately after laying. The egg white and yolk samples were taken
using separate syringes to avoid contamination, and diluted with the
homogenization buffer used in the avidin assay 9. The egg white and
yolk samples, bull sperm and human seminal plasma were stored at -20 °C
until assayed. The hard roe of whitefish (Coregonus albula) or perch
(Perca fluviatilis) were assayed fresh.

Biotin-binding in the egg whites was assayed at room temperature as
previously described 9. Egg whites of 1 species in each family were
also incubated at 100 °C to study the biotin saturation level and to
show whether the protein has a heat stability similar to that of
chicken avidin. The biotin-exchange assay for egg yolks and hard roes
was carried out at 65 °C as described by White et al. 7 and the
14C-biotin-binding reaction in other samples at room temperature
(21-22 °C). The avidin radioimmunoassay was utilized to study in
various species the presence of the antigenic determinants recognized
by antiserum against chicken egg white avidin. The lipid material in
the egg yolks was extracted with 1-butanol 7.

Figure. Displacement of 125I-avidin with chicken and quail egg white.
Serial dilutions of the chicken (□) and quail (△) egg white were
assayed by the radioimmunoassay for chicken avidin. Each point
represents the mean of 4 determinations. Purified chicken avidin was
used to obtain an avidin standard curve (○). Axes: dilution of egg
white; avidin concentration (10^1 to 10^4 ng/ml).

Results and discussion. Biotin-binding was found in all egg whites and
yolks studied (table) including Larus argentatus 5. The biotin-binding
activity varied considerably in the egg whites of different avian
species even within the same family. In the major avian families,
elevated temperature (100 °C) did not increase the 14C-biotin-binding
in the egg

Biotin-binding activities in the egg white and yolk in various avian
and reptilian species

Species a                        Number     14C-biotin bound (cpm x KP/g) b
                                 of eggs    Egg white     Egg yolk

Reptiles
 Chelonia
  Testudinidae
   Testudo hermanni               2          11.4          35.2
Birds
 Gaviiformes
  Gaviidae
   Gavia arctica                  1          18            460
 Anseriformes
  Anatidae
   Anas platyrhynchos c           9          33±4          59±2
   Anas platyrhynchos d           7          180±29        84±4
   Somateria mollissima           8          53±7          48±7
   Melanitta fusca                5          218±29        209±4
   Bucephala clangula             15         139±9         134±11
 Galliformes
  Gallinae
   Gallus domesticus              10         290±13        88±2
  Phasianidae
   Coturnix coturnix              9          513±64        57±7
 Gruiformes
  Rallidae
   Fulica atra                    4          372±44        130±15
 Charadriiformes
  Haematopodidae
   Haematopus ostralegus          1          77            59
  Stercorariidae
   Stercorarius parasiticus       1          165
  Laridae
   Larus ridibundus               15         37±4          132±15
   Larus argentatus               3          20±U          123±11
   Larus canus                    2          57            183
  Sternidae
   Sterna hirundo                 5          158±18        134±18
 Columbiformes
  Columbidae
   Columba livia                  2          57            183
 Passeriformes
  Turdidae
   Phoenicurus phoenicurus        3          73±1          156±13
   Turdus pilaris                 4          26±3          97±9
  Muscicapidae
   Muscicapa striata              3          9±0           136±7
   Ficedula hypoleuca             6          46±7          48±4
  Paridae
   Parus major                    9          317±20        233±11
  Corvidae
   Pica pica                      2          22            143
   Corvus corone                  8          106±15        154±22

a Orders and families are also indicated. b The means ± SEM are given.
c Domestic form. d Wild form.



Experientia 37 ( 1981), Birkhauser Verlag, Basel (Schweiz) 



white (not shown), which suggests that avidin is essentially 
biotin-free. On the other hand, heating decreased biotin-binding
values in some avian species, suggesting that their avidin is
less heat-stable than chicken avidin.
The quail egg white showed an incomplete cross-reaction in 
the radioimmunoassay for chicken avidin (fig.), while the 
egg whites of other avian species could not prevent 125I-labelled
avidin from binding to antiserum. This result
indicates differences in the antigenic determinants of avidin 
molecules as compared to chicken avidin. 
The biotin-binding protein found in the chicken egg 
yolk 7,8 was demonstrated here to be a common constituent
in the egg yolk of various avian species (table). This protein
is distinct from avidin, since it is denatured at 100 °C 7,10.
The biotin-binding activities in the egg yolks also varied 
considerably from species to species. No clear correlation 
was found between the biotin-binding activities in the egg 
white and yolk in different species. The lipid-free yolk 
material in all avian species studied did not show any cross- 
reaction in the avidin radioimmunoassay. 
The egg white and yolk of the turtle also showed biotin- 
binding activity (table), but no cross-reaction in the avidin 
radioimmunoassay. No biotin-binding activity was found in 
the hard roe of the fishes, bull sperm or human seminal 
plasma. Jones and Briggs 5 discovered a low biotin-binding 
activity in fresh bull sperm. This discrepancy in results 
might originate in the microbiological avidin assay they 
used, since the growth of microbes could be inhibited by 
any growth inhibitor present in the bull sperm. 
It has been proposed that the biotin-binding proteins might
be widely distributed in the animal kingdom and play some
fundamental role in the physiology of reproduction 3-5. An
antimicrobial 6,12 effect for avidin, and a biotin-transporting
role for yolk biotin-binding protein, have been suggested
as their functions. Fishes and mammals 13-16 so far studied
did not contain any biotin-binding protein similar to that
found in egg whites and yolks. It seems possible that special
biotin-binding proteins have evolved for reproductive purposes
in amphibian, reptilian and avian eggs.



1 We thank Mr Jukka Pettonen and Mr Antti Kariin for the 
collection of the avian eggs with permission obtained from the 
Ministry of Agriculture, and Mr Reino Saarinen for the turtle 
eggs. The authors are indebted to Mrs Ouu" Kurronen, Miss 
Riitta Mero and Miss Tiina-Maija Mattila for technical assis- 
tance. This work was supported by the Ford Foundation 
Grants No. 760-0526 and No. 790-0665. 

2 R.E. Eakin, E.E. Snell and R.J. Williams, J. biol. Chem. 140, 535 (1941).

3 R. Hertz and W.H. Sebrell, Science 96, 257 (1942).

4 R.E. Feeney, J.S. Anderson, P.R. Azari, N. Bennett and M.B. Rhodes, J. biol. Chem. 235, 2307 (1960).

5 P.D. Jones and M.H. Briggs, Life Sci. 1, 621 (1962).

6 N.M. Green, Adv. Protein Chem. 29, 85 (1975).

7 H.B. White III, B.A. Dennison, M.A. Della Fera, C.J. Whitney, J.C. McGuire, H.W. Meslar and P.H. Sammelwitz, Biochem. J. 157, 395 (1976).

8 H.W. Meslar, S.A. Camper and H.B. White III, J. biol. Chem. 253, 6979 (1978).

9 M.S. Kulomaa, H.A. Elo and P.J. Tuohimaa, Biochem. J. 175, 685 (1978).

10 M.S. Kulomaa, H.A. Elo, A.O. Niemela and P.J. Tuohimaa, Biochim. biophys. Acta 670, in press (1981).

11 R.D. Mandella, H.W. Meslar and H.B. White III, Biochem. J. 175, 629 (1978).

12 H.A. Elo, S. Raisanen and P.J. Tuohimaa, Experientia 36, 312 (1980).

13 R. Hertz, Physiol. Rev. 26, 479 (1946).

14 H.A. Elo, M.S. Kulomaa and P.J. Tuohimaa, Comp. Biochem. Physiol. 62B, 237 (1979).

15 H.A. Elo, Comp. Biochem. Physiol. 67B, 221 (1980).

16 P. Tuohimaa, M. Kulomaa, A. Niemela, T. Torkkeli, O. Janne and S.J. Segal, Proc. natl Acad. Sci. USA, submitted for publication.



A low molecular weight tracer molecule for immunocytochemistry. Identification of cytoplasmic actin 

R. Tiggemann and M. V. Govindan 

[...] Konstanz (Federal Republic of Germany), and German Cancer Research
Centre, Im Neuenheimer Feld 280, D-6900 Heidelberg (Federal Republic of Germany), 11 February 1981

Summary. Anti-actin Fab-fragments were tagged to a small electron dense tracer molecule, ferrocene monocarboxylic acid
(230 daltons). The conjugate stains actin filaments, which were found mainly in the core of microvilli.



Many technical efforts have been made to visualize antigenic
structures immunocytochemically. The main obstacle has been that the
methods depend upon very large tracer molecules such as ferritin. The
immunoperoxidase method does not eliminate this problem either, as the
enzyme-antibody complex is still too large in diameter to pass through
the cell membrane. Attempts were also made to allow large molecules to
penetrate the plasma membrane with membrane disrupting agents 3,4, or
with enzymatic digestion of certain membrane components 5; however,
these manipulations resulted in the destruction of the cell shape.
Thus most techniques are still far from being established for
immunocytochemistry. We here present a more suitable staining
procedure, which helps to avoid most of the difficulties mentioned
above. The very small Fab-fragment - ferrocene carboxylic acid (FMCA)
complex identifies intracellular antigens without background effects.
The procedure is easy to handle, direct and does not result in the
destruction of cellular ultrastructure.

Actin was isolated from Ehrlich mouse ascites tumour (MAT) cells
according to Lazarides and Weber 6, purified by
polymerization-depolymerization cycles 7 and by DNase-I affinity
chromatography 8. Actin was injected s.c. into male New Zealand
rabbits in the presence of complete Freund's adjuvant (protein
concentration: 1.5 mg/ml). This was repeated on days 8 and 40 after
the 1st inoculation. The IgG fraction was isolated from the serum and
purified by DEAE-52 ion exchange chromatography 9. Fab-fragments were
prepared according to Porter 10 and labeled with the iron-containing
sandwich molecule FMCA, using a water soluble carbodiimide 11.
Fab-fragments (5 mg protein), FMCA (5 mg) and
1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (10 mg) were dissolved
in 2.5 ml 10 mM sodium phosphate and gently stirred at 4 °C overnight.
The
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Abstract 

A prerequisite for the use of recombinant antibody technologies starting from hybridomas or immune repertoires is the 
reliable cloning of functional immunoglobulin genes. For this purpose, a standard phage display system was optimized for 
robustness, vector stability, tight control of scFv-ΔgeneIII expression, primer usage for PCR amplification of variable region
genes, scFv assembly strategy and subsequent directional cloning using a single rare cutting restriction enzyme. This 
integrated cloning, screening and selection system allowed us to rapidly obtain antigen binding scFvs derived from 
spleen-cell repertoires of mice immunized with ampicillin as well as from all hybridoma cell lines tested to date. As 
representative examples, cloning of monoclonal antibodies against a his tag, leucine zippers, the tumor marker EGP-2 and 
the insecticide DDT is presented. Several hybridomas whose genes could not be cloned in previous experimental setups, but 
were successfully obtained with the present system, expressed high amounts of aberrant heavy and light chain mRNAs, 
which were amplified by PCR and greatly exceeded the amount of binding antibody sequences. These contaminating 
variable region genes were successfully eliminated by employing the optimized phage display system, thus avoiding time 
consuming sequencing of non-binding scFv genes. To maximize soluble expression of functional scFvs subsequent to 
cloning, a compatible vector series to simplify modification, detection, multimerization and rapid purification of recombinant 
antibody fragments was constructed. 

Keywords: Phage display; Single-chain Fv; Monoclonal antibody; Antibody library 



Abbreviations: BSA, bovine serum albumin; cam, chloramphenicol; CDR, complementarity determining region; cfu, colony forming
units; DDT, 1,1-bis(p-chlorophenyl)-2,2,2-trichloroethane; EGP-2, epithelial glycoprotein-2; ELISA, enzyme linked immunosorbent assay;
EMCS, N-(ε-maleimidocaproyloxy)succinimide; FR, framework; gIIIp, wild-type geneIII protein of filamentous phage; IMAC, immobilized
metal affinity chromatography; IPTG, isopropylthiogalactoside; LZ, leucine zipper; nt, nucleotide; OD, optical density; PBS,
phosphate buffered saline; pelB, pectate lyase gene of Erwinia carotovora; PCR, polymerase chain reaction; scFv, single-chain Fv
fragment; SD, Shine-Dalgarno sequence; SDS, sodium dodecylsulfate; SDT7g10, Shine-Dalgarno sequence of T7 phage gene10; SOE-PCR,
splicing by overlap extension PCR; tet, tetracycline; V H, heavy chain variable domain; V L, light chain variable domain.
* Corresponding author. Tel.: (+41-1) 257-5570; Fax: (+41-1) 257-5712; e-mail: plueckthun@biocfebs.unizh.ch
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1. Introduction 

Molecular cloning and sequencing of antibody 
variable domains forms the basis of antibody mod- 
elling (Rees et al., 1994), antibody engineering 
(Plückthun, 1994; Nilsson, 1995) and experimental
structure determination by NMR (Freund et al., 1994) 
or X-ray crystallography at high resolution 
(Ostermeier et al., 1995). Moreover, once the vari- 
able region genes have been cloned, the antibody 
domains can be further engineered in a multitude of 
ways to produce antibody variants with lower
immunogenicity (Güssow and Seemann, 1991), higher
affinity (Marks et al., 1992; Riechmann and Weill, 
1993; Deng et al., 1994), altered antigenic specificity 
(Ohlin et al., 1996), or enhanced stability 
(Glockshuber et al., 1990; Reiter et al., 1994). Fur- 
thermore, genetic fusions of scFv fragments to effec- 
tor proteins and toxins are powerful tools in the 
fields of medicine and diagnostics (Huston et al., 
1993). 

In all application areas, the demand for efficient 
generation of functional antibody fragments in- 
creases continuously. Although large prefabricated 
antibody libraries are gradually becoming a source of 
recombinant antibody fragments that cover a wide 
range of useful affinities (Vaughan et al., 1996), it 
may still be necessary to use the diversity of the 
immune system to create the most extensive panel of 
different antibodies against a given target possible. 
Furthermore, it is often of great interest and impor- 
tance to clone V H and V L domains of the natural 
antibody response to a given antigen. In cases in 
which a large amount of experimental or clinical 
data is available on a given monoclonal antibody 
(mAb), it is frequently useful to base new constructs 
on this work and to determine its specific sequence 
and binding mode. Cloning and sequencing retains 
and immortalizes the unique and extensively charac- 
terized specificity of mAbs, which can be crucial for 
the rescue of unstable hybridoma cell lines. 

One major problem in rapidly and simply obtain- 
ing sequence information about mAbs stems from 
the occurrence of aberrant mRNAs which are tran- 
scribed from rearranged, but non-functional, heavy 
and light chain genes in the hybridoma (Cabilly and
Riggs, 1985; Strohal et al., 1987; Carroll et al., 1988;
Kaluza et al., 1992; Kütemeier et al., 1992; Nicholls
et al., 1993; Duan and Pomerantz, 1994; Yamanaka
et al., 1995; Ostermeier and Michel, 1996). These
non-productive chains are frequently preferentially 
amplified over the productive ones by sets of primers 
specific for the variable regions of antibody genes. 
The aberrant chains may greatly dilute the desired 
antibody sequences, which are the only ones binding 
the antigen in a pool of non-productive antibody-like 
sequences. Several attempts have been reported to 
overcome this problem, such as ribozyme cleavage 
of a known aberrant κ chain sequence (Duan and
Pomerantz, 1994), treatment of aberrant 
mRNA/DNA hybrids with RNAseH (Ostermeier and 
Michel, 1996), or functional screening for full length 
scFv products in an in vitro transcription/translation 
system (Nicholls et al., 1993). Each of these methods 
is time consuming, depends on prior sequence infor- 
mation of the contaminating gene and fails to enrich 
binding molecules by selection procedures. Since 
antibody genes are usually amplified by PCR using 
degenerate sets of primers, mismatches and PCR 
errors will lead to point mutations or out-of-frame 
clones, which can also contribute to a background of 
non-functional scFv molecules. Therefore, it is abso- 
lutely vital, but often neglected (Miller et al., 1995; 
Kwak et al., 1996), that the binding specificity of the 
recombinant antibody sequence is demonstrated to 
be comparable with the binding characteristics of the 
parental monoclonal antibody, even when the de- 
duced antibody sequence seems reasonable. 

The inherent advantage of phage display is its 
direct link of DNA sequence to protein function 
(McCafferty et al., 1990; Winter et al., 1994). Thus, 
single clones can be rapidly screened for antigen 
binding and, even more importantly, selected from 
pools in the same experimental setup. This obviates 
the use of sequence specific methods to eliminate 
undesired sequences and leads to a more generally 
applicable procedure for hybridoma cloning. 

However, phage display suffers from the fact that non-productive,
aberrant chains are often very well expressed and non-toxic to the
bacterial cell, whereas cells expressing functional scFv-geneIII
fusions have a growth disadvantage and are selected against. The
scFv-geneIII fusion protein can cause vector instability, creating
deletions in the antibody fusion genes as occasionally observed
(Courtney et al., 1995; Dziegiel et al., 1995; A. Krebber, unpublished
observations; footnote 1). Thus, it is highly recommended to use a
regulatable vector system allowing tight product suppression during
all propagation steps as well as controlled expression of low amounts
of scFv-geneIII fusion protein for phage display. Since a variety of
serious technical problems concerning hybridoma cloning and enrichment
of binding antibody fragments from phage display libraries have been
reported 1, we have developed the reengineered phage display system
described in this work. In order to provide a robust and
straightforward methodology which ensures fast and reliable cloning,
not only of hybridomas but also of larger antibody libraries, each
step in the process was optimized. To illustrate the utility of our
improved phage display system we report in detail several case studies
of successfully cloned scFvs derived from monoclonal antibodies as
well as enrichment of binding scFv sequences from cloned B cell
repertoires.



2. Materials and methods 

2.1. Isotyping 

Isotypes of the mAbs were determined using the 
IsoStrip mouse monoclonal antibody isotyping kit 
(Boehringer Mannheim). 

2.2. Preparation of mRNA 

mRNA was extracted from 1-5 × 10^6 hybridoma
or spleen cells using the QuickPrep mRNA purification
kit from Pharmacia. In the case of hybridoma
cell lines 13AD and 42PF total RNA was isolated
essentially as described by Berger and Chirgwin
(1989).

2.3. First strand cDNA synthesis 

About 1 µg mRNA or 5 µg total RNA was reverse transcribed in a
reaction volume of 33 µl using random hexamer primers according to the
manufacturer's protocol (first strand cDNA synthesis kit (Pharmacia)).

1 For typical examples, see the internet discussion forums,
http://www.bio.net/hypermail/METHDS-REAGNTS,
http://www.bio.net/hypermail/MOLECULAR-REPERTOIRES.

2.4. PCR amplification of V L and V H 

Various DNA polymerases (Taq (Perkin Elmer, Gibco), Pwo (Boehringer
Mannheim), Pfu (Stratagene), Vent (New England Biolabs)) were
successfully used for separate amplifications of V L and V H. For
amplification of V L from hybridomas either λ or κ primers were chosen
according to the isotype. PCR reactions were performed in 50-100 µl
volumes, containing 2-5 µl of cDNA reaction, 2 µM of LB and LF primer
mixes (Table 1, Fig. 1B) for amplification of V L or 2 µM of HB and HF
primer mixes (Table 1, Fig. 1B) for amplification of V H, 200 µM
dNTPs, an optimized Mg2+ concentration (2-6 mM) and reaction buffer
supplied by the manufacturers. After 3 min denaturation at 92°C, 2 U
of DNA polymerase were added, followed by 7 cycles of 1 min at 92°C,
30 s at 63°C, 50 s at 58°C, 1 min at 72°C, and 23 cycles of 1 min at
92°C, 30 s at 63°C, 1 min at 72°C. One tenth of each PCR reaction was
analyzed by agarose gel electrophoresis (Fig. 2).
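The cycling conditions given above can be written down as a small data structure, which makes it easy to check the number of cycles and the programmed hold time. The sketch below is illustrative only: the class and function names are hypothetical, ramp times between temperatures are ignored, and the manual addition of polymerase after the initial denaturation is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple[Step, ...]


# Program transcribed from the text: 3 min at 92, then 7 cycles with an
# additional 58 degree step, then 23 cycles without it.
V_GENE_PROGRAM = (
    Stage(1, (Step(92, 180),)),
    Stage(7, (Step(92, 60), Step(63, 30), Step(58, 50), Step(72, 60))),
    Stage(23, (Step(92, 60), Step(63, 30), Step(72, 60))),
)


def total_hold_seconds(program: tuple[Stage, ...]) -> int:
    """Sum of all programmed hold times, ignoring ramping."""
    return sum(stage.cycles * sum(s.seconds for s in stage.steps) for stage in program)


def amplification_cycles(program: tuple[Stage, ...]) -> int:
    """Number of cycles in stages that contain an extension step at 72 degrees."""
    return sum(st.cycles for st in program if any(s.temp_c == 72 for s in st.steps))


print(amplification_cycles(V_GENE_PROGRAM))
print(total_hold_seconds(V_GENE_PROGRAM) / 60)
```

The actual run time on a thermocycler will be longer than the summed hold time because heating and cooling between steps are not included.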

2.5. Assembly PCR 

The full length PCR products of V L and V H were 
purified by preparative agarose gel electrophoresis in 
combination with the QIAEX (Qiagen) or Jetsorb 
(Genomed) DNA extraction kit. Approximately 10 
ng of each V L and V H DNA were combined by 
SOE-PCR (Fig. 1C; Ge et al., 1995). An initial 
denaturation step (3 min, 92°C) was followed by 2 
cycles of 1 min at 92°C, 30 s at 63°C, 50 s at 58°C, 1 
min at 72°C in the absence of primers. After adding 
the outer primers scback and scfor (Table 1, Fig. 1C;
each 1 µM), 5 cycles of 1 min at 92°C, 30 s at 63°C,
50 s at 58°C, 1 min at 72°C, and 23 cycles of 1 min 
at 92°C, 30 s at 63°C, 1 min at 72°C were performed. 
One tenth of each PCR reaction was analyzed by 
agarose gel electrophoresis (Fig. 2). 

2.6. SfiI digest and cloning of scFv fragments into
pAK100

The gel-purified scFv fragment and the phage display vector pAK100
were both digested with SfiI for 3 to 4 h at 50°C (Fig. 1D). After
purification, the scFv fragment was ligated into the vector (molar
ratio vector to insert 1.5 : 1) and transformed into E. coli XL1-Blue
(Stratagene). For library construction, ligation mixtures precipitated
with n-butanol (Thomas, 1994) were electroporated into XL1-Blue (Dower
et al., 1988; yield approximately 5 × 10^7 clones per µg SfiI-cut
insert DNA). After plating on NE medium (non expression medium: 2 × YT
containing 1% glucose and 25 µg/ml chloramphenicol)


Table 1 

Listing of the primers used for assembling mouse scFv fragments in the orientation VL-(G4S)4-VH, which are compatible with the pAK
vector system presented in Fig. 1 and Fig. 4



Primer VLbacK; 



LBl 

LB2 

LB3 

LB4 

LB5 

LB 6 

LB7 

LBS 

LB9 

LB10 

LB11 

LB 12 

LB 13 

LB14 

LBl 5 

LB16 

LB17 

LBX 



5' FLAG VL3 

t tac tcgcflgggcagccgqccatqac qgactacaaflG 

5' FLAG VL 

gccatggcgt 
gccatggcgj 
gccatggcgt 
gccatggcgt 
gccatggcgt 
gccatggcgt 
gccatggcgt 
gccatggcgt 
gccatggcg* 
gccatggcgt 
gccatggcgs 
gccatggcgj 
gccatggcgt 




tTCCAGCTGACTCAGCC 
lTTGTTCTC^CCCAGTC 
LTTGTGtfTMACTCAGTC 
lTTGTGXTBACACAGTC 
.TTGTEATGACMCAGTC 
lTTMAGA'?RAMCCAGTC 
lTTCAGATGAXDCAGTC 
kTXCAGATGACACAGAC 
LTTGTTCTCASSCCAGTC 
lTTGKGCT£ACCCAATC 
tTTSTEATGACCCABTC 
LTGACCCAEAC 
iTTGTGATGACfiCAGEC 



gccatggc g qactacaaaGAY ATTGTGATAACYfAnGA 
gccatggcgg^cioc^aAG^YATTGTGATGACCCAGHT 
gccatggcgfla^tac^aafiarATTGTGATGACACAACC 
gccatggcgqac£ac^aa£A^TTTTGCTGACTCAGTC 
gccatggc gg^cj^c^aaS^TGCTGTTGTGACTCAGGAATC 



Primer VL for: 

5* (Gly4Ser)3-linker VL 3 1 

ggagccgccgccgcc (agaaccaccaccacc) 2ACGTTTGATTTCCAGCTTGG 



LF1 
LF2 
LF4 
LF5 
LFX 



ggagccgccgccgcc (agaaccaccaccacc) 2ACGTTTTATTTCCAGCTTGG 
ggagccgccgccgcc (agaaccaccaccacc) 2 ACGTTTTATTTCC AACTTTG 
ggagccgccgccgcc (agaaccaccaccacc) 2 ACGTTTCAGCTCCAGCTTGG 
ggagccgccgccgcc (agaaccaccaccacc) 2 ACCT AGGACAGTC AGTTTGG 



8 
8 
16 
12 



16 
16 
12 
4 
4 
2 
2 
1 



111 Mix 
1 

2 
5 

3.5 
4 
7 
6 

1.5 
2 

3.5 

8 

8 

6 

2 

2 

1 

1 

1 



1 
1 
1 
1 

0.25 



Primer VHhacfr 

HBl 
HB2 
HB3 
HB4 
HB5 
HB6 
HB7 
HB8 
HB9 
HB10 
HB11 
HBl 2 
HBl 3 
HBl 4 
HBl 5 
HBl 6 
HBl 7 
HBl 8 
HBl 9 



5' (Gly4Ser) 2- linker BaraHI VH 3 ' d \i\ Mix 

S^g^ggcg^tccggtggtggtgaatCiGAKGTEMAGCTTCAGGAGTC 8 4 

ggcgg^ggCTOetccaqtqqtaqt aaatcc GAGQTBpAr^pn&ry'&ryTV' 9 4 

racggcggcgqctccggtqqtqqt ggatcq CAGGTGgAGOTaAAGaASTr 4 3 

ggcg^ ggcOTctccggtggtggtqaatccGAflGTrrARfTTY^&At-aBTn 4 4 

gucgycuuuuuutecqqtgqtqqt qaat;cc caGC?r^CAraTrgranr'aB7r' 12 7 

ggcggcggcggotccggtggtggtgaat^CAGGTXCABCTGCAGCAGTC 4 2 

ggcggtjggcggctccggtqqtqq tqqatccC AGGTCCACGTnAAnrArypr 1 1 

ggcggcggcggcteeggtqqtqqt qqatccG AGGTGAAgSTGGTGGAATg 4 2 

ggt^gcggcggctccgqtqgtgqtg g a fc ccGAVGTGAijGYTcyriY^ 12 5 

ggcggcggcggctccggtggtggtg^a^^AGGTGCAG^GGTGGAGTC 4 2 

ffffCggcgigcggctccggtggtggtq^ai^GAKGTGCAMCTGGTGGAGTC 4 2 

ggcggcggcggtitccggtggtggtgs^JtccGAGGTGAAGCTGATGGABTC 2 2 

Ogegqcggcgqctccgqtqqtgqfc qqatcc GAGGTC^AfirT'Tryi^arriY' 2 1 

ggcggcOTcggctccgqtqqtqqt qqatcc GAi^ivvAGrrrTrTr^arrrr 4 2 

ggcggcgqcggctccgqtqqtqqt qqatcq GAAGTGAARSTTGAGGAQTC 4 2 

ggcg^eggcggctecggtggtggtgaaiCiCAGGTTACTCTEAAAGJJGTSTG 8 5 

ggcggcggeggctccggtggtggtgaai^CAGGTCCAACTVCAGCABCC 6 3.5 

ggcggcmcggctCCgqtqqtqqt qqatcq CATGTGAACTTnnftAri'Ty^'Pf' -- 1 0.7 

gjcggcuycguctccggtqqtqqt qqatcc GAGGTGAAGGTrrATrrtartTr \ q.7 



Primer VHfor
scfor

HF1
HF2
HF3
HF4



5' EcoRI 3'

ggaattcggcccccgag

5' EcoRI VH 3'

ggaa 1 1 cggc cccca agg cCG AGGAAACGgtg AfynTfyyp \ \ 

ggaattcg^^cccgaggc^GAGGAGACTGTGAGAGTGGT 1 1 

g g aa 1 1 cggcgcc c g aa2££GC AGAG ACAGTGACCAGAGT 1 1 

ggaattcg^^cccqa qqcCG AGGAGACGGTC^CTriAnrrp 1 1 
in 530 cm² dishes (Nunc) and overnight incubation
at RT, the colonies were scraped off the plates into 8
ml 2× YT (Sambrook et al., 1989) and subsequently
stored at -80°C after addition of 10% glycerol.

2.7. Rescue of scFv-displaying phages

To rescue scFv-displaying phages, 50 ml NE
medium was inoculated with approximately 10⁹ cells
from the glycerol library stock. The culture was then
shaken at 37°C. At OD550 = 0.5, 10¹¹ cfu helper
phage VCS (Stratagene) and 25 µl 1 M IPTG solution
were added. After 15 min incubation at 37°C
without agitation, the culture was diluted in 100 ml
LE medium (low expression medium: 2× YT containing
1% glucose, 25 µg/ml chloramphenicol and
0.5 mM IPTG). The culture was then shaken for 10 h
at 26°C for phage production. Kanamycin (30 µg/ml)
was added 2 h after infection. Phage particles were
purified and concentrated 100-fold by two
PEG/NaCl precipitations (Sambrook et al., 1989),
resuspended in PBS and stored at 4°C. After
overnight culture, a phage titer in the range
of 10¹¹ cfu/ml was typically observed.
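
As a quick check of the induction step above, the IPTG concentration reached in the 50 ml culture follows from C1·V1 = C2·V2. The sketch below is illustrative only: the function name is hypothetical, and the small volume contributed by the IPTG stock itself is assumed to be negligible.

```python
def final_concentration_mM(stock_M, stock_volume_ul, culture_volume_ml):
    """Final concentration (mM) after adding a small stock volume to a culture.

    Assumption: the added stock volume is negligible relative to the culture.
    """
    stock_mM = stock_M * 1000.0                  # M -> mM
    stock_volume_ml = stock_volume_ul / 1000.0   # µl -> ml
    return stock_mM * stock_volume_ml / culture_volume_ml


# 25 µl of 1 M IPTG added to the 50 ml NE culture
iptg_mM = final_concentration_mM(stock_M=1.0, stock_volume_ul=25.0, culture_volume_ml=50.0)
print(f"IPTG after addition: {iptg_mM:.2f} mM")
```

Under these assumptions the culture reaches 0.5 mM IPTG, the same concentration as in the LE medium used for the subsequent dilution, so the inducer concentration should stay roughly constant throughout phage production.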

2.8. Selection of antigen binders by panning 

For selection, immunotubes (Nunc, Maxisorp)
were coated overnight at 4°C with 4 ml of 10-100
µg/ml antigen solution (for the anti-ampicillin libraries:
100 µg/ml transferrin-EMCS-ampicillin in
PBSU (1:1 mixture of PBS (20 mM NaPi, 150 mM
NaCl, pH 7.2) and urea-NaPi buffer (8 mM urea, 50
mM NaPi, pH 7.0)); for the anti-leucine zipper mAb
13AD: 10 µg/ml biotin-LZ/streptavidin complex
(Leder et al., 1995) in 34.8 mM NaHCO3, 15 mM
Na2CO3, 0.02% NaN3, pH 9.6; for the anti-EGP-2
mAb MOC31: 100 µg/ml EGP-2 in PBS). After
blocking with 4% dried skimmed milk powder in
PBS for 2 h at room temperature (RT), 10¹¹ phagemid
particles in 4 ml PBS containing 2% milk were
added and incubated for 2 h with rocking at RT.
Tubes were then washed 15 times with PBS/0.1%
Tween and 15 times with PBS. Bound phages were
eluted from the tube either with soluble antigen (1 ml 10
mM ampicillin in PBS; anti-ampicillin libraries) or
with 800 µl 0.1 M glycine/HCl pH 2.2 for 10 min. The
latter solution was neutralized with 48 µl 2 M Tris
and the phages (typically 10⁴-10⁶ cfu/ml) were
used for reinfection (30 min at 37°C without agitation)
of E. coli XL1-Blue cells in NE medium
(OD550 = 0.5-0.8). In the case of elution with ampicillin,
the solution was treated with 2 units of β-lactamase
for 15 min before reinfection. This sublibrary
was rescued as described above and subjected
to further panning rounds or binding analysis by
phage ELISA.
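
The numbers in the panning procedure give a rough idea of how small a fraction of the input phage is recovered per round. The following is a minimal sketch, assuming that the 10¹¹ input phagemid particles can be compared directly with eluted cfu and that the acid eluate has a volume of 800 µl plus 48 µl Tris; the function name is hypothetical.

```python
def recovery_fraction(output_titer_cfu_per_ml, eluate_volume_ml, input_particles):
    """Fraction of input phage recovered in the eluate of one panning round."""
    return output_titer_cfu_per_ml * eluate_volume_ml / input_particles


eluate_ml = (800 + 48) / 1000.0   # glycine/HCl eluate plus neutralizing Tris
input_particles = 1e11            # phagemid particles added to the immunotube

for titer in (1e4, 1e6):          # typical range of eluted phage (cfu/ml)
    fraction = recovery_fraction(titer, eluate_ml, input_particles)
    print(f"{titer:.0e} cfu/ml eluted -> recovered fraction {fraction:.1e}")
```

With these assumptions, only on the order of one in 10⁷ to one in 10⁵ of the input particles is carried into the next round, which is why the eluate is amplified by reinfection before further panning.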

2.9. Phage ELISA 

Single colonies were grown separately in 2 ml NE 
medium at 37°C. After reaching OD550 = 0.3, 1 ml



Table 1 (continued). In this nomenclature, 'back' refers to 'toward the 3' end of the antibody gene' and 'for' to 'toward the 5' end of the
antibody gene'. The sequences are given using the IUPAC nomenclature of mixed bases (shown in underlined capital letters; R = A or G;
Y = C or T; M = A or C; K = G or T; S = C or G; W = A or T; H = A or C or T; B = C or G or T; V = A or C or G; D = A or G or T),
with a column listing the d-fold degeneracy encoded in each primer and the volume (µl) to be used to set up the PCR mix. The LB1-LB17 series
encodes a stretch of 20 bases hybridizing to the mature mouse antibody κ sequences (in capital letters). Underlined is the preceding
sequence, which encodes the shortened FLAG sequence (Knappik and Plückthun, 1994). Since the FLAG tag uses the fixed N-terminal
aspartate of the mature antibody (encoded by GAY), only three additional amino acids are necessary. The FLAG codons are then preceded
by the codons specifying the end of the pelB signal sequence. The LBλ primer for mouse lambda chains is constructed in an analogous
manner (the N-terminal glutamate of the mature mouse λ sequence is replaced by aspartate (encoded by GAT) to generate a FLAG tag).
The VLfor primer sequences are complementary to the J elements of κ or λ chains (capital letters) and encode three repeats of the Gly4Ser
sequence, the terminal one (bold) of which has a different codon usage so that incorrect overlaps during the PCR assembly reaction are
minimized. The VHback primers encode the other part of the linker as well as a BamHI recognition site (underlined), and the overlap with
VLfor in the sequence shown in bold. The 20 bases given in capital letters hybridize with the mature mouse VH sequences. The last 19 nt at
the 3' end of the VHfor primers hybridize with the JH region. The first nt shown in capital letters will introduce a silent mutation at the end
of VH in order to code for the first nt of the second SfiI recognition site (underlined). The final assembly of the scFv gene by SOE-PCR is
carried out with the scback and scfor primer set. The outer primer scback encodes the first SfiI site (underlined). All primers contain a
phosphorothioate at the 3' end, with the exception of scfor, to avoid potential interference with SfiI digestion.
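
The d-fold degeneracy listed for each primer is the product of the number of bases allowed at each position under the IUPAC code given in the legend. A minimal sketch follows; the function names are hypothetical and the example primer is made up for illustration, not copied from Table 1.

```python
from itertools import product

# IUPAC mixed-base codes as listed in the Table 1 legend
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(variant) for variant in product(*choices)]


# Hypothetical 20-base annealing region with four two-fold positions (Y, M, R, W)
example = "GAYATTGTGATGACMCARWC"
print(degeneracy(example))   # 2 * 2 * 2 * 2 variants
print(expand("GAY"))         # the two aspartate codons
```

Summing `degeneracy` over all primers of a mix gives the total number of variants it represents, which is how figures such as the variant counts quoted for the back primer mixes can be checked against the table.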
NE medium complemented with 5 × 10⁹ cfu VCS
helper phage (Stratagene) and 1.5 mM IPTG was
added. The cultures were allowed to produce scFv-displaying
phages during overnight incubation at
24°C. Phages from 1.6 ml culture supernatants were
PEG precipitated and dissolved in 200 µl PBS. For the
anti-leucine zipper hybridomas 13AD and 42PF, scFv-displaying
phages were produced and concentrated
100-fold by PEG precipitation as described for the
phage panning experiments. Antigen at 10-100 µg/ml
(for the anti-his tag hybridoma 3D5: 100 µg/ml
his-tagged citrate synthase in PBS; for the anti-leucine
zipper hybridoma 42PF: 10 µg/ml BSA-LZ(7P14P)
in 34.8 mM NaHCO3, 15 mM Na2CO3,
0.02% NaN3, pH 9.6; for the anti-DDT hybridoma
3D7: 70 µg/ml β-alaninamide-DDT conjugated to
lysozyme; for the anti-ampicillin library and the
hybridomas 13AD and MOC31: coating solutions as
described for the panning procedure above) was
coated onto NUNC plastic ELISA plates by overnight
incubation at 4°C. 50 µl phage solution per well was
mixed with 50 µl 4% dried skimmed milk powder in
PBS in the presence or absence of competing soluble
antigen and incubated at RT for 10 min. After washing
and blocking (blocking buffer: 4% dried skimmed
milk powder in PBS) of the ELISA wells, the phage
solution was added and incubated for 1 h at RT.
After washing, 100 µl of 1/5000-diluted HRP/anti-M13
conjugate (Pharmacia) in blocking buffer was
added and incubated for 1 h at RT. For detection,
100 µl soluble BM blue POD-substrate (Boehringer
Mannheim) was used.

2.10. Soluble expression of scFv fragments in 
pAK300 and pAK400 

For soluble expression, 20 ml expression medium
(2× YT containing 25 µg/ml chloramphenicol) was
inoculated with 200 µl of preculture (JM83 harboring
the expression plasmid for the anti-ampicillin
scFv antibody aL2, pAK300scFvaL2 or
pAK400scFvaL2, respectively; grown overnight at
RT in NE medium) and incubated in a shaking
waterbath at 24°C. Expression was induced at OD
= 0.5 with 1 mM IPTG and allowed to proceed for 4
h at 24°C. To monitor total scFv production, an
aliquot of the culture was diluted in PBS to an
OD600 = 1 and mixed with 5× reducing SDS-PAGE
sample buffer (Sambrook et al., 1989). The rest of
the culture was adjusted to OD600 = 5 by centrifugation
and resuspension in 2 ml PBS. The cells were
disrupted three times by French press and separated
into soluble and insoluble fractions by centrifugation
(soluble fraction = supernatant; insoluble fraction =
pellet vortexed in the same buffer volume). Aliquots
of both fractions were mixed with 5× reducing
SDS-PAGE sample buffer. A 12.5 µl aliquot of each
sample was used for 0.1% SDS-12% PAGE and
subsequent Western blot detection.
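
Adjusting a culture to a fixed OD600 by centrifugation and resuspension amounts to conserving the product of OD and volume. The sketch below is a hedged illustration: the function name and the measured OD600 of 1.6 are assumptions chosen for the example, not values from this work.

```python
def culture_volume_to_pellet_ml(measured_od, target_od, resuspension_volume_ml):
    """Culture volume to centrifuge so that the resuspended pellet reaches target_od.

    Assumption: OD scales linearly with cell density (OD x volume is conserved).
    """
    if measured_od <= 0:
        raise ValueError("measured OD must be positive")
    return target_od * resuspension_volume_ml / measured_od


# Hypothetical culture at OD600 = 1.6, to be brought to OD600 = 5 in 2 ml PBS
volume_ml = culture_volume_to_pellet_ml(measured_od=1.6, target_od=5.0,
                                        resuspension_volume_ml=2.0)
print(f"Pellet {volume_ml:.2f} ml of culture and resuspend in 2 ml PBS")
```

Normalizing every sample to the same OD600 before cell disruption is what makes band intensities on the subsequent Western blot comparable between vectors.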

2.11. Western blot 

Gels were blotted onto PVDF membranes using
standard protocols. The scFv fragments were detected
using the anti-FLAG mAb M1 (Kodak), followed
by an anti-mouse IgG peroxidase conjugate,
essentially as described in Knappik and Plückthun
(1994).

2.12. Sequencing 

Nucleic acid sequences were determined either by 
manual Sanger dideoxy sequencing (USB Sequenase 
kit), or by cycle sequencing (SequiTherm Long-Read
Cycle Sequencing Kit-LC, Epicentre Technologies) 
with fluorescent primers using a DNA sequencer 
(LI-COR). 



Fig. 1. Scheme of amplification and cloning procedure. A: cDNA synthesis. mRNA derived from spleen cells or hybridomas and random
hexamer primers (pd(N)6) or subclass-specific primers (not shown) are used for cDNA synthesis. B: PCR amplification of VL and VH
domains. The cDNA is used as the PCR template for the amplification of VL and VH domains by the primer mixes indicated (listed in Table
1). C: assembly SOE-PCR. VL and VH PCR products are first assembled into the scFv format (splicing by overlap extension) without
primers and subsequently amplified by the outer primer pair scback and scfor. D: SfiI digestion of the amplified scFv fragment. The rare-cutting
enzyme SfiI is the only enzyme used for antibody cloning. E: ligation of SfiI-digested pAK100 vector and insert. Note that
directional cloning of the SfiI inserts is guaranteed because of the different SfiI sticky ends shown. In addition, self-ligation of insert or
vector molecules is excluded by the asymmetry of the overhang. The phage display vector pAK100 contains a tet resistance cassette (tetA
and tetR; 2101 bp) to facilitate monitoring of complete SfiI digestion by gel electrophoresis and by religating and subsequent plating on tet
plates, the lacI repressor gene, a strong upstream terminator (tHP), the lac promoter/operator and the pelB leader sequence, which has
been modified to contain an SfiI site (for details see Fig. 3A). After ligation, the antibody fragment is fused in frame to geneIII250-406
(Fig. 3B). The in-frame fusion contains a myc-tag (Munro and Pelham, 1986) to act as a detection handle, in addition to the short 3-amino
acid FLAG tag at the N-terminus (Knappik and Plückthun, 1994). The asterisk represents an amber codon. The geneIII portion starts at
position 250 of the wt geneIII protein (gIIIp), thus avoiding extraordinarily long glycine linkers and, most importantly, any unpaired
cysteine of gIIIp. The expression cassette is followed by a downstream terminator (tlpp). The origins for phage replication and plasmid
replication are as described in Ge et al. (1995). The chloramphenicol (cam) cassette is originally derived from pACYC184, but its
expression strength has been modified by randomizing the promoter and selecting clones with optimal growth and selection properties
(Krebber et al., 1995). F: detection and enrichment of binding scFv sequences by phage display. The scFv insert is displayed on the tip of
filamentous phage, whereas the genetic information encoding the particular scFv fragment is packaged as single-stranded DNA (ss
pAK100scFv) in the phage interior. Panning of single-chain antibody displaying phages against the antigen allows the enrichment of
functional antibody sequences.
3. Results

3.1. Design features of the improved phage display
system

The reengineered phage display system and optimized
methodology used in this work combine the
following significantly improved features.

(i) In many cases, previously reported primer sets
were too restricted to amplify either particular light
or heavy chains (Table 2). Therefore, the set of
mouse primers used in this study (Table 1) has been
extended and optimized. It incorporates all mouse
VH, Vλ and Vκ sequences collected in the Kabat database
(Kabat et al., 1991) and combines extended
primer sets described by Kettleborough et al. (1993),
Ørum et al. (1993) and Zhou et al. (1994).

(ii) The VLback primer set encodes a convenient,
shortened version of the FLAG peptide, which introduces
only three additional amino acids at the N-terminus
of VL. In this way, the scFv can be easily
detected and purified by a commercially available
mAb (Knappik and Plückthun, 1994; Ge et al., 1995;
Kalinke et al., 1996).

(iii) To minimize PCR errors, polymerases with 
proof-reading capacity are used whenever possible 
(Marks et al., 1991; Yamanaka et al., 1995). 

(iv) The scFv fragment is efficiently assembled by
SOE-PCR (splicing by overlap extension; Horton et
al., 1989) from two (Ge et al., 1995; Vaughan et al.,
1996) rather than three pieces (Clackson et al., 1991;
Ørum et al., 1993; recombinant phage antibody system
(Pharmacia)).

(v) To avoid the occurrence of incorrect overlaps
during assembly PCR, the four (Gly4Ser) repeats in
the single-chain linker region are encoded by different
codons (Table 1; Ge et al., 1995). In order to
reduce the dimerization or aggregation tendency of
scFv fragments (Desplancq et al., 1994; Huston et
al., 1995), the linker between VL and VH is 20
amino acids in length rather than the frequently used
15-amino-acid variant.

(vi) SfiI is the only enzyme used for directional
cloning of scFv fragments into the optimized phage
display vector pAK100 (Fig. 1E). The use of this
enzyme has a number of distinct advantages: SfiI
recognizes eight bases, interrupted by five non-recognized
nucleotides (5'-GGCCNNNNNGGCC-3').
SfiI restriction sites are therefore very rare in antibody
sequences, so elimination of potentially interesting
sequences by internal digestion is very unlikely.
Two different sticky ends were designed to
allow cloning of the scFv fragment in a directional
manner. In contrast to palindromic sticky ends, the 3
bp overhangs derived from SfiI sites make
self-dimerization of either insert or vector impossible.
Finally, SfiI has the interesting property that it always
cuts two sites at once (Wentzell et al., 1995), and
therefore single-cut plasmids or inserts do not occur
as intermediates. Digestion of vectors or inserts with
single SfiI sites requires the binding of two different
DNA molecules to the restriction enzyme and slows
the turnover rate (Wentzell et al., 1995). While
vectors with asymmetric SfiI sites have been described
(Zelenetz, 1992; Barbas and Wagner, 1995;
Yang et al., 1995), surprisingly this feature has not
been used in antibody library cloning. Usually only
one SfiI site, mostly in combination with NotI as a
second site, is used (Hoogenboom et al., 1991; Ørum
et al., 1993; Vaughan et al., 1996). Other systems
employ a set of four enzymes to clone VL and VH
independently of each other (Orlandi et al., 1989;
Barbas et al., 1991; Johansen et al., 1995; Yamanaka
et al., 1995). These systems thus run a significantly







Fig. 2. DNA gel. PCR amplification of VL and VH from mouse
hybridomas 13AD and 42PF using the primer sets listed in Table
1. Lane 1: VH mAb 42PF; lane 2: VL mAb 42PF; lane 3:
assembled scFv 42PF; M: 1 kb marker (Gibco); lane 4: VH mAb
13AD; lane 5: VL mAb 13AD; lane 6: assembled scFv 13AD. A
1.5% agarose gel is shown.
higher risk of cutting in antibody genes, and they also
incorporate internal restriction sites in the variable
region genes that create mismatches with the antibody
template and bias amplification by poor primer
hybridization. Furthermore, AvrII, SacI and SpeI
sites, which are present in most sets of enzymes used
to date, cut in the majority of mouse λ chains and
are therefore not suitable for simultaneous cloning of
λ and κ light chains.

(vii) A frequently observed phenomenon is the
contamination of antibody libraries with uncut recipient
vectors (Courtney et al., 1995; Johansen et al.,
1995). Normally, antibody-free vectors have a growth
advantage over scFv-encoding ones and cause problems
during the enrichment of antigen-binding antibody
sequences by phage display. Therefore, the
pAK100 vector (Fig. 1E), with a tetracycline resistance
cassette (tetA and tetR; 2101 bp) inserted
between the two different SfiI sites, is used as the
recipient. Digestion with SfiI yields a linearized
vector which can be easily separated from the uncut
one by gel electrophoresis. The loss of tet resistance
can further ensure complete cutting of the recipient
vector.

(viii) In order to avoid immunity to superinfection
(Stengele et al., 1990), which is caused by expression
of fusion proteins containing full-length gIIIp, it
is beneficial to use a truncated version of gIIIp. The
truncated gIII 250-406 (Fig. 3B; Lowman et al.,
1991), which is shorter than the more commonly
used gIII 198-406 version, was chosen in order to
eliminate a long glycine/serine-rich linker stretch
that favored instability. More importantly, the unpaired
cysteine 201 at the end of the N-terminal
domains, which reduces the folding yield of antibody-gIII
fusions (data not shown), was also removed
by this approach. Given that even background expression
of truncated pIII fusions has been shown to
be able to suppress superinfection to some extent
(Ørum et al., 1993), it is furthermore an important



Table 2

Summary of cloned hybridomas

Hybridoma cell line                 13AD         42PF         3D5          MOC31        3D7

Isotype                             λ, IgG1      κ, IgG2b     κ, IgG2b     κ, IgG1      κ, IgG1
Tumor cell fusion partner           X63Ag8.653   X63Ag8.653   X63Ag8.653   X63Ag8.653   X63Ag8.653
Antigen                             LZ           LZ(7P14P)    (his)5 tag   EGP-2        DDT
Binders without panning             0/20         2/14         4/12         0/22         3/10
Binders after two panning rounds    5/6          nd           nd           8/10         nd
Identified aberrant or              aVH13AD.1    aVL42PF.λ    nd           nd           nd
  non-binding sequences             aVH13AD.2

PCR amplification by:
Pharmacia Primer Mix
  VL                                No           No           nd           nd           No
  VH                                Yes          Yes          nd           nd           Yes
Primers derived from Orlandi et al., 1989
  VL                                nd           nd           nd           No           nd
  VH                                nd           nd           nd           aVHref       nd



Five hybridomas of three different isotypes have been cloned according to the scheme outlined in Fig. 1 and Fig. 9. For all hybridomas,
X63Ag8.653 (Kearney et al., 1979) was used as the tumor cell fusion partner. Hybridomas 13AD and 42PF produce antibodies directed
against leucine zippers (Leder et al., 1995), 3D5 against C-terminal his tags (Lindner et al., 1997), MOC31 against the epithelial
glycoprotein-2 (Souhami et al., 1988) and 3D7 against a derivative of DDT (Burgisser et al., 1990). Functional binders (signal > 10 times
background) are identified by phage ELISA as described in Section 2. The amino acid sequences of all identified aberrant chains are listed
in Fig. 5. The sequences of aVH13AD.1 and aVHref are identical (except for amino acid 56; Fig. 5) to the aberrant VH sequence published
by Kütemeier et al. (1992). The same sequence was exclusively found (three independently sequenced clones) during VH amplification of
hybridoma MOC31 using primers derived from Orlandi et al. (1989), whereas VL could not be amplified using this primer set.
Amplification of VL using the commercially available primer mix of Pharmacia failed in the case of hybridomas 42PF and 3D7, probably
because appropriate sequences are absent from this mix, as well as for hybridoma 13AD, due to its λ isotype (nd = not determined; no = no
PCR product detected; yes = PCR product detected).
pAK100scFv, pAK200scFv, pAK300scFv

end lad 

. • • ATGCAGCTGGCACGACAGGTTTCO^ 

...MQLARQVSRLESGQ* 

tnp terminator' 
GGTACCCGATAAAAGOGt CT ICCl^gAf^ 



CAP binding site 
CCACCTCAACGCAATTAATGTC^^ 



"35 * -10 lac-operator 

CTTTACACTTTATGCITCCGGCTCGTATGTTGTG 

l-> mRNA 

SD1 LacZ 
TftACAATTTCACACAGGJ^AACAGCTATGACC^ 

rfTMITNP* 

SD2 pelB signal sequence 

TAACSASQGCAAATCATGAAATACCTATTGCCTACGGCAGCCGCTGGATT 
MKYLLPTAAAGL 

Sfil FLAG VL 

GTTATTACTCGCgaCCCAGCCGGCCATGGCGGACTACAAACAY . . . 
LLLAAQPAMADYKD ... 



pAK500scFv 

VH Sfil EcoRI dHLX 
C!SK2CC/TCGGGGi5<2^3AATTCCCCAAACCrA 

ASGAEPPKPSTPPGS 



GCAGTGGTGAACTGGAAGAGCTGCTTAAGCATCTTAAAGAACTTCTGA 
SGBL EELLKHLKELLK 



GGCCCCCGCAAAGGCGAACTCGAGGAACTGCTG 

GPRKGELEELLKHLKBL 

his tag 

GCITAAAGGTGGGAGCGGAGGCGCGCCGCACCATCATCACC^ 
LKGGSGGAPHHHHH* 

Hindi 1 1 
TAAGCTT. . . 



pAK600scFv 

VH Sfil EcoRI alkaline phosphatase (AP) 
. . . COG£CTCGGGG£Ci^GAATTCCGGACACCAGAAATXKrCTGTTCTG . . . 
ASGAEFRTPEMPVL ... 
start AP 



F Y T M K 



L G L 



pAK400scFv 

SD2 pelB signal sequence 

- . . GA a AG£AGATATACATATGAAATACCTATTGCCTACGGCAGCC . . . 
T7gl0 MKYLLPTAA .,, 



pAKlOOscFv 

VH Sfil EcoRI myc tag 
. . . Cg^CTCGGGGGCCGAATTCGAGCAGAAGCTGATCTCTGAGQAAnAr 
ASGAEFEQKLISEED 

genelll 250-406 
£^TAGGGTGGTGGCTCTGGTTCCGGTGATTTTGATTATGAAAAG . . . 
L*GGGSGSGDFDYEK ... 



pAK200scFv 

VH Sfil genelll 250-406 

• . • CSG£CTCGGGGG£QGAGGGCGGCGGTTCTGGTTCCGGTGATTTT . 
... ASGAEGGGSGSGDY . 



pAK300scFv, pAK400scFv 

VH Sfil his tag 

CG^CXTCGGGGGCCGATCACCATCATCACCATCATTAGT . 

ASGADHHHHHH* 



K * 

end AP 



Fig. 3. A: upstream sequence of pAK100scFv, pAK200scFv,
pAK300scFv and pAK400scFv. The region from the end of the
lacI repressor gene to the beginning of the antibody VL domain is
shown. The lacI repressor gene, tHP terminator sequence, CAP
binding site, lac promoter/operator region (lac p/o) including
the -35 and -10 sequence, Shine-Dalgarno (SD) sequence of
lacZ (SD1), lacZ peptide, a second SD sequence (SD2), pelB
signal sequence, N-terminal SfiI site, four amino acid FLAG tag
and the start of the VL domain (bold) are indicated above the
sequence. For pAK400, the 15 bp upstream from the pelB start
codon are replaced by a sequence including the SD sequence of
the phage T7 gene 10. B: downstream sequence of pAK100scFv,
pAK200scFv, pAK300scFv and pAK400scFv. Relevant differences
in the downstream sequences of pAK100, pAK200, pAK300
and pAK400 are shown. The last two bases of VH (bold), SfiI and
EcoRI restriction sites, myc or his tags and the start of
geneIII(250-406) are indicated above the sequence. The stop
codons are represented by asterisks. This corresponds to the
region of the right-hand SfiI site in Fig. 4. C: sequences of
EcoRI/HindIII fusion cassettes as used in pAK500 and pAK600.
The dHLX dimerization motif was taken from Pack et al. (1993).
The complete sequence of the mature E. coli alkaline phosphatase
(AP) gene can be found in Shuttleworth et al. (1986). In order to
provide an EcoRI/HindIII cloning cassette, the two internal EcoRI
sites of the AP gene have been removed by silent mutations (A.
Knappik, unpublished data).



improvement to engineer the system such that com- 
plete product repression prior to helper phage infec- 
tion can be ensured (see below). 



(ix) A strong upstream tHP terminator (Nohno et
al., 1986) was incorporated between the lacI gene
and the lac promoter region of pAK100 (Fig. 1E,
Fig. 3A). This tHP terminator sequence, in combination
with glucose repression of the lac promoter (De
Bellis and Schwartz, 1990), completely abolishes
background expression before induction (for details
see Krebber et al., 1996a). By these measures, selection
against toxic scFv-gIII fusion proteins is avoided
during propagation steps and plasmid maintenance is
thus significantly improved.



(x) The lac repressor is encoded on the phagemid 
to ensure strain independent lac promoter repres- 
sion. 

(xi) Combining a synthetic SD sequence with a
pelB signal sequence (Fig. 3A) leads to only a
moderate level of translation, allowing a low level of
scFv-gIII expression upon induction by IPTG.
Fig. 4. pAK vector series. Phage display vector pAK100 (A) and related vectors (B-F) are useful to build modifications into antibody
fragments cloned by the strategy outlined in Fig. 1 and Fig. 9. pAK200-pAK600 contain the same elements as described for pAK100 except
for the modified cassettes shown. All vectors contain a tet resistance cassette (tetA and tetR; 2101 bp) to facilitate the monitoring of SfiI
cutting. The in-frame fusion to geneIII250-406 using pAK100 (A) leads first into a myc-tag (Munro and Pelham, 1986) to be used as a
detection handle, followed by an amber codon (asterisk). Depending on the strain used, it is possible to switch between soluble expression of
scFvs (in the case of non-suppressor strains such as JM83) and expression of scFv-geneIII fusions (with suppressor strains such as
XL1-Blue). The direct in-frame fusion to geneIII250-406 as in pAK200 (B) lacks the EcoRI site, myc-tag and amber codon. C-terminal his
tag fusions can be used for purification by IMAC as well as for detection by an anti-his tag antibody (Lindner et al., 1992, 1997; Kalinke et
al., 1996) (C, D). Fusion partners, including helices for dimerization (Pack et al., 1993) (E) and alkaline phosphatase (AP) for direct
detection of antigens by dimerized APscFv fusions (Lindner et al., 1997) (F), can be added to pAK100 or pAK400. The expression strength
of either antibody or antibody fusions can be enhanced by replacing the original Shine-Dalgarno (SD2) sequence by the stronger SDT7g10,
as carried out in pAK400 (D).
(xii) The insertion of an amber codon upstream of
ΔgeneIII in pAK100, as described by Hoogenboom
et al. (1991) and Lowman et al. (1991), allows
switching between expression of membrane-anchored
scFv-geneIII250-406 fusion proteins and soluble
scFvs simply by changing the expression host
It must also be taken into account, however, that
amber suppression is not complete and for this reason,
an analogous phage display vector was constructed
(pAK200; Fig. 4B), lacking the stop codon
of pAK100. This leads to a higher proportion of
displayed scFv-geneIII fusion protein, as monitored
by ELISA and Western blot (data not shown). Direct
competition of the same scFvs cloned into pAK100
or pAK200, however, reproducibly results in a complete
enrichment of the pAK100 vector type after
one round of phage panning (data not shown). This
suggests that the lower level of fusion protein expression
is a selective advantage.

(xiii) Chloramphenicol resistance (Kang et al., 
1991) was used as the selective marker because it 
was found to allow more stringent selection than 
attainable with ampicillin, since the resistance pro- 
tein does not leak into the medium. Furthermore, the 
use of chloramphenicol is advantageous over that of 
kanamycin or tetracycline because it does not reduce 
the phage titer as much (Johansen et al., 1995; 
Krebber et al., 1995; Krebber A., unpublished obser- 
vations). 

(xiv) The procedure can be used for library 
cloning, e.g. the repertoire from immunized mice, as 
well as for cloning of single sequences from hy- 
bridomas in a similar way. Furthermore, it is directly 
compatible with the recently introduced selectively 
infective phage system (SIP), which allows in vivo 
and in vitro selection of cognate protein/ligand in- 
teractions by strictly coupling the infectivity of fila- 
mentous phages to the binding event (Krebber et al., 
1995; Krebber et al., 1996b). 

(xv) Since the optimized phage display vector
pAK100 is engineered to achieve low levels of expression,
it is not a useful large-scale production
system for well-folding and soluble antibody fragments.
Thus, a compatible high-level expression
plasmid has also been engineered (Fig. 3A, Fig.
4D).

(xvi) A compatible vector series that facilitates
various modifications of scFv fragments subsequent
to cloning into pAK100 is also available (Fig. 3B,
Fig. 3C, Fig. 4C-F; see also Ge et al., 1995).
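
Feature (vi) above rests on two properties of the SfiI site (5'-GGCCNNNNNGGCC-3'): it is rare in antibody genes, and its 3-base overhang comes from the non-recognized spacer and can therefore differ between two sites. The sketch below scans a sequence for SfiI sites and reports the overhang, assuming the enzyme cuts after the fourth spacer base so that spacer bases 2-4 form the 3' overhang. The function names are hypothetical, and the test sequence is a made-up construct that merely embeds the two site sequences read from Fig. 3.

```python
import re

# Lookahead so that overlapping sites would also be reported
SFI_SITE = re.compile(r"(?=(GGCC([ACGT]{5})GGCC))")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sfi_sites(seq):
    """Return (position, site, overhang) for each SfiI site in seq.

    Scanning one strand is sufficient because the recognized bases of the
    site read the same on both strands.
    """
    hits = []
    for match in SFI_SITE.finditer(seq.upper()):
        site, spacer = match.group(1), match.group(2)
        hits.append((match.start(), site, spacer[1:4]))
    return hits


# Made-up test sequence containing the upstream and downstream site sequences
test_seq = "AAGGCCCAGCCGGCCATGGCGTTTTGCGGCCTCGGGGGCCGAATTC"
for position, site, overhang in find_sfi_sites(test_seq):
    self_compatible = overhang == reverse_complement(overhang)
    print(position, site, overhang, "self-complementary:", self_compatible)
```

Because an overhang of three bases can never equal its own reverse complement, neither end can pair with a copy of itself, and two sites with different spacers give the directional, non-self-ligating ends described in the text. Running the same scan over candidate VL and VH sequences is one way to confirm that no internal SfiI site is present before cloning.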

3.2. Amplification of V region genes and assembly 
into the scFv format 

The VLback primer mix (LB1-17 and LBλ,
representing a total of 131 variants) paired with five
VL forward primers (LF1, 2, 4, 5 and LFλ) and the
VHback mix (HB1-19, representing a total of 94
variants) paired with four VH forward primers (HF1-4)
have been used to amplify VL and VH domains
from a variety of antibody cDNAs (Table 2).

Our improved primer set (Table 1) has been tested
in different laboratories on cDNA derived from 12
hybridoma cell lines of different specificities and
family sub-types to date. In all cases, the first PCR
amplification yielded sufficient amounts of products
for cloning, with a sharp band at the predicted size of
375-402 bp for VL or 386-440 bp for VH. Typical
examples of VH and VL genes amplified from cDNA
of two hybridomas, 42PF and 13AD, which secrete
monoclonal antibodies directed against leucine zippers
(Leder et al., 1995), are shown (Fig. 2). Using
the same cDNA in combination with a commercially
available primer mix (recombinant phage antibody
system (Pharmacia)) or primers derived from Orlandi
et al. (1989), amplification of VL failed in several
cases (Table 2), underlining the importance of an
extended primer mix.



Fig. 5. Sequence alignment of functional and aberrant variable domains expressed by the hybridoma cell lines 13AD and 42PF. Residue
numbers are according to Kabat et al. (1991). The 7 amino acids at each end are encoded by the PCR primer sequences. A: Vλ amino acid
sequences. VL42PF.λ: non-binding Vλ found in clone 42PF; identical to germline Vλ1 sequence (Weiss and Wu, 1987) except for F92Y.
VL13AD (X99507): functional, antigen-binding Vλ sequence of hybridoma 13AD. B: Vκ amino acid sequences. VL42PF.κ (X99509):
functional, antigen-binding sequence of clone 42PF. aVLref: aberrant Vκ transcript found in P3X63Ag8.653 (Carroll et al., 1988 (M35669);
Duan and Pomerantz, 1994; Cabilly and Riggs, 1985; Strohal et al., 1987; Yamanaka et al., 1995). C: VH amino acid sequences. VH42PF
(X99508): functional VH of hybridoma 42PF. VH13AD (X99506): functional VH of hybridoma 13AD. aVH13AD.1: aberrant VH.1 found in
clone 13AD, showing a frameshift in CDR3. aVHref: non-functional VH published by Yamanaka et al. (1995); this sequence is identical to
the unpublished result of Mocikat (D50398). aVH13AD.2: aberrant VH.2 found in 13AD. The sequence differs from aVH13AD.1, and also
contains a frameshift in CDR3. EMBL accession numbers are given in brackets.
For further analysis, the VL and VH PCR products
of 42PF and 13AD have been cloned separately into
the pCR-Script vector (Stratagene) and sequenced.
For 42PF, two plausible light chain sequences devoid
of frameshifts, stop codons, deletions or atypical
amino acids for murine VL domains (VL42PF.κ,
VL42PF.λ) were found, together with one heavy
chain sequence (VH42PF) (Fig. 5). For hybridoma
13AD, only 3 of 57 clones analyzed contained a bona
fide functional heavy chain gene, denoted VH13AD,
whereas two additional non-functional heavy chain
sequences, aVH13AD.1 (five clones) and
aVH13AD.2 (49 clones), were found (Fig. 5). Both
heavy chains are aberrantly rearranged at the DJ
recombination site in CDR3 and contain several
framework amino acids which deviate from the observed
consensus of antibody sequences (Fig. 5).
Sequencing of five VL chains, amplified exclusively
by the λ primer pair LBλ/LFλ, yielded a unique
sequence denoted VL13AD (Fig. 5). Thus, both the
13AD and 42PF hybridoma produced more than one
PCR-amplifiable heavy or light chain.

As outlined in Fig. 1, all amplified VL and VH
domains were linked by SOE-PCR, as shown
for 13AD and 42PF (Fig. 2). The products were
subsequently digested with SfiI and ligated into the
improved phage display vector pAK100.

3.3. Screening and enrichment of functional scFv
sequences derived from hybridomas

After transformation of the ligation reaction into
the recombination-deficient E. coli strain XL1-Blue,
10-22 individual colonies were grown separately
and infected with helper phage as described in Section
2. The recombinant scFvs, displayed on the surface
of filamentous phage, were tested for antigen binding
in a typical phage ELISA. In those cases where the
parental hybridoma cell line did not produce large
amounts of contaminating, non-functional light or
heavy chain, about one third of the screened colonies
contained the sequence information of the binding
scFv fragments (examples are hybridoma 42PF
amplified with a VLκ mix, devoid of λ primers, and
hybridomas 3D5 and 3D7; Table 2). At the other
extreme, an initial screening of phages derived from
individual colonies of 13AD did not yield any
functional binders. As previously demonstrated by the
sequencing of individual VH domains, functional
sequences are greatly diluted by aberrant heavy
chains in this hybridoma cell line. In order to
identify and enrich functional binders, 10⁵ E. coli
colonies were pooled after transformation and
subjected to two rounds of phage panning. After each
round, six clones were tested for antigen binding in a
phage ELISA. Two of six and five of six clones from
the first and second panning rounds, respectively,
were found to be positive for antigen binding. All
positive clones had identical sequences (VL13AD
paired with VH13AD), whereas all non-binding scFv
sequences contained the aberrant heavy chains
aVH13AD.1 or aVH13AD.2, occasionally in
combination with point mutations in the light chain gene.
Clones which were found to be positive in phage
Fig. 6. Competition phage ELISA (13AD, 42PF). Competition
phage ELISA with phages displaying functional and non-functional
scFv fragments derived from hybridomas 13AD and 42PF.
The ELISA was performed as described in Section 2. The non-binding
13AD scFv clone contains the aberrant aVH13AD.1 chain
(Fig. 5C) and the functional VL13AD chain (Fig. 5A). For
inhibition, phages were preincubated for 10 min with 10⁻⁴ M
soluble peptide antigen before applying the mixture to the
antigen-coated plate. As a negative control, an assay with VCS
helper phage was performed.
Fig. 7. Enrichment by panning. A: cloning of hybridoma MOC31.
Enrichment of EGP-2 binding scFv by phage panning. B: repertoire
cloning. Enrichment of ampicillin binding scFv displaying
phages derived from anti-ampicillin library I. Phage pools (5 × 10¹⁰
cfu/well) prepared after 0, 1, 2 and 3 rounds of phage panning
(column label), as well as VCS helper phage as a negative control,
were tested for specific antigen binding in a phage ELISA as
described in Section 2. Selective enrichment was also indicated by
an increased number of phages eluted in subsequent panning
rounds (data not shown).

ELISAs were further characterized by antigen inhibi- 
tion studies to verify that the binding was antigen- 
specific (Fig. 6). In the case of hybridoma MOC31 
(Souhami et al., 1988), again no binders were ini- 
tially obtained, and two to three rounds of phage 
panning were required to enrich binding scFv frag- 
ments to a level that allowed identification of func- 
tional antibody sequences in individual clones (Fig. 
7A; Table 2). This shows that the relevant sequences
of numerous 'monoclonal' antibodies can be hidden
in a pool of closely related antibody-like sequences 
and that, in the absence of panning, rigorous testing 
would be required in order to identify the correct 
sequence. 
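
The dilution effect described above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that colonies are picked independently and that the heavy-chain clone frequency reported for 13AD (3 functional clones out of 57) also applies to the scFv colonies, ignoring the light chain. Under these assumptions it estimates how likely it is that a screen of 10-22 colonies contains no functional heavy chain at all.

```python
from fractions import Fraction


def prob_no_functional(functional: int, total: int, screened: int) -> float:
    """Probability that none of `screened` independently picked colonies
    carries a functional chain, given an observed clone frequency."""
    p_functional = Fraction(functional, total)
    return float((1 - p_functional) ** screened)


# Heavy-chain frequency reported for hybridoma 13AD: 3 of 57 clones.
# Screen sizes of 10 and 22 colonies match the range used in the text.
for colonies in (10, 22):
    p_none = prob_no_functional(3, 57, colonies)
    print(f"{colonies} colonies screened: P(no functional VH) = {p_none:.2f}")
```

Under this simplified model a screen of this size has a substantial chance of missing the functional sequence entirely, which is consistent with the need for panning in such cases.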

3.4. Cloning of the antibody response from
immunized mice

The procedure described for hybridoma cloning 
was also applied to mRNA isolated from spleen cells 
of an immunized mouse. In addition, B-cells from 
the same mouse were fused to the tumor cell line 
X63Ag8.653 as in the case of monoclonal antibody 
production, but were kept as a pool for 10 days. This 
pool of hybridomas was subsequently used as a 
source of mRNA. The latter experiment was carried 
out because it might seem conceivable that B cells 
which have been stimulated by the antigen fuse 
preferentially (Köhler and Milstein, 1976). From fusion
experiments, only a small number of candidate clones
(a few thousand) is typically obtained, of
which a high proportion usually codes for antigen
binding antibody sequences. Since productive pairs
of VL and VH domains are separated during the
cloning process and are subsequently combined randomly
to form scFvs, fairly large libraries are necessary
to ensure that all original VL and VH pairings
are represented (Gherardi and Milstein, 1992; Posner
et al., 1994). A comparison of anti-ampicillin libraries
derived from fused (library I) and unfused
B-cells (library II) of the same immunized mouse
should determine whether cell fusion prior to mRNA
preparation is an advantageous enrichment step which
enhances the probability of restoring functional
VL/VH pairings in a small library. As outlined in
Table 3, both libraries contained binding scFv fragments
which could be enriched with a similar efficiency
after two or three rounds of panning (Fig.
7B). Sequencing revealed that the same sequences
were isolated simultaneously from both libraries (data
not shown), indicating that B cell fusion to tumor

Table 3
Selection of ampicillin binding scFv fragments from B cell repertoires

                                Library I                     Library II
Source of RNA                   Spleen cells fused to tumor   B cells
                                cell line X63Ag8.653
Antigen                         Ampicillin                    Ampicillin
Library size                    4 × 10⁶                       6 × 10⁶
Clones containing SfiI insert   20/20                         20/20
Clones expressing scFv          22/30                         18/30
Binders before panning          0/12                          0/12
Binders after two rounds        5/12                          nd
Binders after three rounds      19/24                         9/12


Spleen cells derived from the same BALB/c mouse immunized
with ampicillin were taken for library construction before (library
II) and after (library I) fusion to the tumor cell line X63Ag8.653.
Both libraries were transformed into XL1-Blue cells by
electroporation. The proportion of SfiI insert-containing clones in the initial
library was monitored at the DNA level by restriction analysis,
whereas the proportion of full length scFv-expressing clones was
analyzed by Western blot analysis, using the N-terminal FLAG
detection system combined with C-terminal myc tag detection
(data not shown). Binding scFv fragments were identified by
phage ELISA. The enrichment process was followed by ELISA
using phage pools as shown in Fig. 7B.
cells prior to mRNA preparation, at least in our
experience, has no significant beneficial influence on
library composition.
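
The statement that fairly large libraries are needed to recover the original VL/VH pairings after random recombination can be illustrated with a simple coverage estimate. This is a minimal sketch under strong assumptions that are not taken from the text: every VL and VH sequence is equally abundant, pairing is uniformly random, and clones are independent. The figure of 1000 distinct chains of each type is purely hypothetical.

```python
import math


def prob_pair_present(n_vl: int, n_vh: int, library_size: int) -> float:
    """Probability that one specific VL/VH pairing occurs at least once
    in a library of independent, uniformly random pairings."""
    p = 1.0 / (n_vl * n_vh)
    if p >= 1.0:
        return 1.0 if library_size > 0 else 0.0
    # 1 - (1 - p)^N, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(library_size * math.log1p(-p))


def library_size_for(n_vl: int, n_vh: int, confidence: float = 0.95) -> int:
    """Smallest library size giving the requested probability that a
    specific pairing is present (same assumptions as above)."""
    p = 1.0 / (n_vl * n_vh)
    if p >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))


# Hypothetical repertoire: 1000 distinct VL and 1000 distinct VH sequences.
needed = library_size_for(1000, 1000)
print(f"clones needed for 95% coverage of one pairing: {needed:.2e}")
print(f"coverage at 4e6 clones: {prob_pair_present(1000, 1000, 4_000_000):.2f}")
```

Real mRNA pools are far from uniform, so abundant pairings are recovered from smaller libraries and rare ones need larger libraries than this estimate suggests.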

3.5. Soluble expression and modification of cloned 
scFv fragments 

A scFv fragment obtained from the anti-ampicillin 
library I (scFvaL2) was sub-cloned into pAK300 and 
pAK400 (Fig. 4) for soluble expression in JM83. In 
comparison with the low expression medium (LE 
medium) used for phage display, changing to an 
expression medium devoid of glucose immediately 
increases the expression level of recombinant protein 
without any modifications to the vector system (De 
Bellis and Schwartz, 1990). Changing the translation 
initiation region present in pAK100 or pAK300 into
a much stronger Shine-Dalgarno sequence
(SDT7g10), such as that present in pAK400 (Fig.
3A, Plückthun et al., 1996), results in a further
significant enhancement of protein expression. As 
shown in Fig. 8, the expression level strongly influ- 
ences the ratio of soluble to insoluble scFv protein. 
While at the lower expression level of 
pAK300scFvaL2 100% of the scFv is soluble and 
Fig. 8. Enhanced expression of aL2 in pAK400. The scFvaL2 was
expressed in JM83 harboring pAK300scFvaL2 or pAK400scFvaL2
(Fig. 3 and Fig. 4). Expression levels were monitored by Western
blot analysis as described in Section 2. c: whole culture (soluble
fraction, insoluble fraction and culture supernatant), where the
loaded sample corresponds to 1 ml of culture at an OD of 0.01; i:
insoluble fraction; and s: soluble fraction, where the loaded
sample corresponds to 1 ml of culture at an OD of 0.05.
Fig. 9. Outline: Generation of scFv antibodies from hybridomas.
A flow diagram summarizing the most important steps is shown.



functional, the enhanced expression in 
pAK400scFvaL2 causes production of more soluble 
but also large amounts of insoluble material (Fig. 8). 

The scFvaL2 has comparatively favorable folding 
properties (A. Krebber, unpublished). For scFv fragments
which are already mainly found in the insoluble
fraction after expression in pAK100 or pAK300,
sub-cloning into pAK400 does not improve the yield 
of functional antibody fragment (data not shown). 
Instead, cell lysis problems, caused by such poorly 
folding proteins, are enhanced in the stronger expres- 
sion vector since higher expression levels normally 
lead to a higher proportion of non-functional aggre- 
gates (Le Calvez et al., 1995), which are likely to 
impair growth of the expression host. Therefore, it 
has proved to be advantageous to adapt the expres- 
sion level to the particular scFv sequence which has 
to be expressed by the choice of vector and induction
conditions. 

Moreover, SfiI cassettes of scFv fragments can be
fused directly in frame with oligo-histidine tags for
purification by IMAC (Lindner et al., 1992), with
dimerization or tetramerization modules to obtain
dimeric or multimeric scFv antibodies (Pack et al.,
1993, 1995) and with enzymes such as alkaline
phosphatase to produce dimeric scFv molecules
which can be detected directly by virtue of their
enzymatic activity (Lindner et al., 1997; Fig. 3B and
C and Fig. 4). If IMAC does not directly yield pure
scFv antibody, an immunoaffinity column using an
anti-FLAG mAb can also be employed (Kalinke et
al., 1996). Combination of the N-terminal FLAG tag
and the C-terminal myc or his tags allows monitoring
of full length scFv product formation, since
antibodies against all three tags are available (Munro
and Pelham, 1986; Knappik and Plückthun, 1994;
Lindner et al., 1997). Western blot detection
of N- and C-terminal degradation and of proteolysis
of particular scFv sequences is therefore
straightforward. The whole spectrum of compatible modification
cassettes (see also Ge et al., 1995; Plückthun
et al., 1996) combined with the pAK vector series
creates a highly versatile system, allowing easy
characterization and further genetic engineering of the scFv
fragments initially obtained.

4. Discussion 

The improved phage display system based on the 
pAK vector series (Fig. 4), an extended primer mix 
(Table 1) and a very straightforward cloning proce- 
dure (Fig. 1) proved to be robust and reliable both in 
a library setting and for hybridoma cloning. Follow- 
ing the scheme outlined in Fig. 9 all hybridomas 
tested to date could be cloned, characterized for 
functional antigen binding and sequenced with a 
reasonable effort, in as few as 10 days (hybridoma 
3D5). 

The optimized phage display system was suitable 
for eliminating high amounts of non-functional chains 
that are transcribed from various aberrant mRNAs in 
some hybridoma cell lines. In contrast to other meth- 
ods (Nicholls et al., 1993; Duan and Pomerantz, 
1994; Ostermeier and Michel, 1996), only functional 



and binding antibody genes will be sequenced. After 
RNAseH treatment of aberrant RNA/DNA hybrids 
(Ostermeier and Michel, 1996), eight out of 12 se- 
quenced clones were still derived from aberrant 
pseudogenes, whereas without mRNA treatment all 
nine clones tested carried pseudogenes. Duan and 
Pomerantz (1994) used ribozyme treatment to raise
the proportion of functional sequences
from three positive clones in 150 to 12-34 in 150.
Both methods depend on the availability of sequence 
information of the aberrant chain prior to cloning, 
whereas phage display simply enriches binding se- 
quences over any kind of contaminating chain. This 
has been found to be particularly important, as both 
of the hybridoma cell lines we have characterized in 
detail (13AD and 42PF) produced aberrant mRNAs
which seem to be specific for the individual hybrido- 
mas and could not be found in the published litera- 
ture or any database (Fig. 5). The origin of some of 
the aberrant mRNAs could not be traced to the 
myeloma cell lines originally utilized for cell fusion. 
Hybridoma 13AD, for example, contained three different
heavy chain mRNAs, of which only one is
known to be derived from the tumor cell line 
X63Ag8.653. It seems plausible that many additional 
non-binding chains originate from aberrant rear- 
rangements of the second allele of the B-cell in- 
volved in the generation of the particular hybridoma. 
Termination of rearrangement in the immuno- 
globulin loci takes place only after synthesis of a 
functional membrane-bound immunoglobulin. Thus, 
a given B-cell may produce aberrant mRNAs that 
contain stop codons or frameshifts in addition to the 
functional mRNA that is translated into the mature 
immunoglobulin chain. Furthermore, some hybrido- 
mas may be the result of the fusion of more than one 
B-cell with the myeloma cell. This gives rise to the 
possibility that more than one typical heavy and light 
chain gene is expressed, as observed for hybridoma 
42PF. Such typical in-frame but non-binding se- 
quences cannot be distinguished from the binding 
chain by sequencing. This underlines the importance 
of a functional test involving antigen binding in 
order to avoid the risk of isolating an incorrect 
sequence. 

When this is taken in combination with possible 
PCR errors or mutations introduced by the primer 
mix, it can be envisaged how cloning of a hybridoma 
can very easily generate a diverse collection of different
scFv fragments. Thus, functional screening is
always superior to sequence analysis of individual 
clones. 

If more than one monoclonal antibody against the 
same antigen is available, pooled hybridoma cell 
lines can be used as a source for mRNA extraction, 
as carried out in the case of the anti-ampicillin 
library I. This allows for fast and inexpensive simul- 
taneous cloning of several different antibodies in one 
experimental setup. Alternatively, a small fraction of 
B-cells prepared for traditional monoclonal antibody 
generation can be set aside for phage library con- 
struction. It has been reported, however, that parallel 
screening of hybridomas and phage libraries, as de- 
scribed by Gherardi and Milstein (1992); Kettlebor- 
ough et al. (1994) and Ames et al. (1995), has led to 
the discovery of different antibody sequences from 
the two sources. This may simply reflect the fact that 
neither hybridoma generation nor phage libraries 
provides an exhaustive sampling of the immune re- 
sponse. The findings may suggest in addition that 
recombinant expression of antibody fragments com- 
bined with phage display selects against certain anti- 
body sequences (see also Riechmann and Weill, 
1993; Posner et al., 1994; Jackson et al., 1995). The 
absence of certain sequences in phage libraries might 
be due to insufficient library size, PCR amplification 
of only a subset of binding variable genes or selec- 
tion against phagemids expressing less stable or less 
well folding antibody fragments, particularly if they 
are incorporated into phage particles less frequently 
or impair growth of E. coli (Knappik and Plückthun,
1995; Krebber et al., 1996a). We believe, however, 
that our optimized cloning procedure and the tightly 
regulatable phage display vector will contribute to 
overcoming some of these biases and will therefore 
facilitate the construction of more diverse antibody 
libraries. Given the stress that antibodies fused with 
gIIIp impose on the cell, a high expression level is of
no importance and actually serves as a burden for 
phage display. In contrast, the absence of back- 
ground expression before induction is of utmost im- 
portance if loss of clones from the library or gene 
deletions are not to quickly accumulate. Thus, the 
phage display vector pAK100 was optimized by
lowering the expression level for phage display and, 
more importantly, by eliminating background expres- 



sion before helper phage infection (Krebber et al., 
1996a). In contrast to many previous vector con- 
structs, therefore, even in a library setting, no dele- 
tions, empty vectors, recombination events or other 
symptoms of instability have been detected to date. 
Moreover we claim that it is advisable to sub-clone 
selected scFvs into related, but more powerful, ex- 
pression vectors subsequent to cloning, since unbi- 
ased library screening by phage display and maxi- 
mized functional production of a particular antibody 
fragment are likely to require different expression 
optima. 
Abstract Avidin, a positively charged egg-white glycoprotein, is
a widely used tool in biotechnological applications because of its
ability to bind biotin strongly. The high pI of avidin (~10.5),
however, is a hindrance in certain applications due to non-specific
(charge-related) binding. Here we report the construction of a
series of avidin charge mutants with pIs ranging from 9.4 to 4.7.
Rational design of the avidin mutants was based on known
crystallographic data together with comparative sequence alignment
of avidin, streptavidin and a set of avidin-related genes
which occur in the chicken genome. All charge mutants retained
the ability to bind biotin tightly according to optical biosensor
interaction analysis. In most cases, their thermal stability
characteristics were indistinguishable from those of
wild-type avidin. Our results demonstrate that the charge properties
of avidin can be modified without disturbing the crucial
biotin-binding activity.

© 1998 Federation of European Biochemical Societies. 

Key words: Avidin; Protein engineering; Charge mutant; 
Avidin-biotin technology 



1. Introduction 

Avidin is a basically charged, tetrameric glycoprotein found 
in the chicken egg white. Throughout the years, avidin has 
become a frequently used tool in numerous biotechnological 
applications, including different localization, diagnostic and 
separation technologies [1]. Recently avidin has also found
use in affinity-based targeting of drugs and imaging agents
with promising results [2-5]. All these applications are generally
based on the high affinity (Kd ~ 10⁻¹⁵ M) [6] avidin has
for biotin, a low-molecular-weight vitamin, which can be
readily attached to biologically active binders and detectable 
probes. This strong interaction with biotin, combined with the 
exceptional stability and four biotin-binding sites of avidin 
(one per subunit) has created the inherent utility and the ver- 
satility of the avidin-biotin technology. 

Avidin is, however, a positively charged glycoprotein (pI
~10.5) [6], which possesses eight arginine and nine lysine
residues [7]. The high pI of avidin and the presence of carbohydrate
residues have been a constant hindrance to its use in
some applications, due to non-specific binding (mostly charge- 
related) to extraneous material. For this reason streptavidin, a 
non-glycosylated and neutrally charged bacterial counterpart 
of avidin [8], has become the preferred choice in such appli- 
cations. The preference of streptavidin over egg-white avidin 
has prevailed, despite the fact that avidin is more hydrophilic,
contains more lysine residues for potential attachment of 
probes, and is considerably more abundant and cheaper 
than streptavidin. 

In the present work, we wanted to investigate whether the
pI of avidin can be reduced using protein engineering, without
significantly disturbing the biotin-binding activity or the
stability characteristics of avidin. We used sequence comparison
of streptavidin [9] and recently cloned avidin-related (avr)
genes [10], together with the crystallographic structure of avidin
[11,12], to design the changes. This approach has allowed
us to generate a series of fully functional avidin mutants with
pIs ranging from 9.4 down to 4.7. These reduced-charge mutants
bind biotin in a manner similar to that of wild-type
avidin and also display clearly reduced non-specific binding
characteristics. Therefore, this study also offers new possibilities
for the applications of avidin-biotin technology. Preliminary
results of this work were presented at the 8th European
Congress on Biotechnology, Budapest, Hungary, 17-21 August
1997, Abstracts, p. 158.

2. Materials and methods 

2.1. Site-directed mutagenesis and construction of recombinant 
baculoviruses 

Mutagenesis of avidin cDNA [13] was accomplished by the PCR-based
megaprimer method [14] using Pfu DNA polymerase (Stratagene,
La Jolla, CA, USA). The oligonucleotides for mutagenesis were
purchased from either KEBO Lab (Espoo, Finland) or MedProbe
(Oslo, Norway). After digestion with BglII and HindIII (Promega,
Madison, WI, USA), the PCR fragments were subcloned into the
BamHI/HindIII digested pFastBAC1 donor vector (Gibco-BRL,
Gaithersburg, MD, USA). The mutations were confirmed by
double-stranded sequencing using Sanger's dideoxynucleotide chain
termination procedure with an automated DNA sequencer (ALF, Pharmacia
Biotech). The recombinant baculoviruses were generated using the
Bac-To-Bac baculovirus expression system according to the
manufacturer's instructions (Gibco-BRL). The primary virus stocks were
amplified for large-scale production of avidin mutants and the titers of
the stocks were determined by a plaque assay procedure [15].

2.2. Expression and purification of avidin mutants 

Insect cells, Sf9 (ATCC CRL 1711), were maintained as a suspension
culture in serum-free Sf-900 II SFM medium (Gibco-BRL) and
infected with recombinant baculoviruses at an m.o.i. of 0.5-2 pfu/cell.
The infection was allowed to proceed for 24 h, after which the culture
medium was removed by centrifugation (100 × g, 22°C, 5 min) and
replaced with fresh biotin-free medium. After the medium change, the
infection was continued for another three days. The purification of
each avidin mutant was carried out by affinity chromatography using
2-iminobiotin-agarose as previously described by Airenne et al. [16].
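
The m.o.i. range quoted above translates into a virus inoculum volume and an expected fraction of infected cells. The following sketch uses the standard relations (volume = m.o.i. × cells / titer, and a Poisson distribution of virus particles per cell); the cell number and stock titer are hypothetical values chosen only for illustration and are not taken from this work.

```python
import math


def virus_volume_ml(moi: float, cell_count: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock needed to reach the requested m.o.i."""
    return moi * cell_count / titer_pfu_per_ml


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells receiving at least one infectious
    particle, assuming a Poisson distribution of particles per cell."""
    return 1.0 - math.exp(-moi)


# Hypothetical culture: 1e8 cells in total, stock titer 1e8 pfu/ml.
cells, titer = 1e8, 1e8
for moi in (0.5, 2.0):
    volume = virus_volume_ml(moi, cells, titer)
    infected = fraction_infected(moi)
    print(f"m.o.i. {moi}: {volume:.1f} ml stock, {infected:.0%} of cells infected initially")
```

At the lower end of the range a sizeable fraction of cells is not infected in the first round, so the outcome also depends on secondary infection during the following days.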

2.3. Protein analysis 

Electrophoretic analysis was carried out using 15% (w/v) SDS-PAGE
with a discontinuous buffer system [17]. After electrophoresis,
proteins were either stained with Coomassie brilliant blue or blotted
onto nitrocellulose membrane for immunostaining according to
Airenne et al. [16]. Isoelectric focusing was performed using
polyacrylamide gels with a pH gradient ranging from 3 to 10. An aliquot of
avidin mutants (5 µg) together with the pI standards (Bio-Rad) was
applied to the gel, and following the run the proteins were visualized
by Coomassie staining. The quaternary status of avidin was analyzed
by FPLC on a Superose 12 column (Pharmacia) using an LKB HPLC
system. A sample (20 µg in 100 µl of phosphate buffer with 0.65 M
NaCl, pH 7.2) was applied, and chromatography was carried out at
a flow rate of 0.5 ml/min, using the same ionic strength in the
equilibration and running buffers. The column was calibrated using bovine
γ-globulin, BSA, avidin standard, ovalbumin, ribonuclease and
cytochrome c as molecular weight markers.

2.4. Interaction analysis of avidin mutants 

Binding kinetics were measured using optical biosensor technology
(IASyS Manual+, Affinity Sensors, Cambridge, UK). The
measurements (Req, in arc seconds, a measure of the mass on the surface) were
carried out using either a commercial biotin cuvette (Affinity Sensors)
or by immobilizing 2-iminobiotin onto the carboxymethyldextran
cuvette using N-hydroxysuccinimide activation. Binding of various
concentrations of avidin or avidin mutants onto the 2-iminobiotin surface
was measured in a 50 mM borate buffer (pH 9.5) containing 1 M
NaCl at room temperature. The iminobiotin cuvettes were regenerated
with 20 mM HCl. The measurements using the biotin cuvette were
carried out using PBS with 1 M NaCl as a binding buffer at room
temperature. The kinetic rate constants for association (k_on) and
dissociation (k_off) or the dissociation constant (Kd) were calculated using
the Fast Fit program package (Affinity Sensors).
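
For orientation, the relation between the kinetic rate constants and the dissociation constant can be written out explicitly. The sketch below assumes a simple 1:1 binding model and is not a description of how the Fast Fit package works internally; the rate constants, concentrations and maximal response are hypothetical numbers used only for illustration.

```python
def kd_from_rates(k_on: float, k_off: float) -> float:
    """Dissociation constant (M) from association (M^-1 s^-1) and
    dissociation (s^-1) rate constants, assuming 1:1 binding."""
    return k_off / k_on


def equilibrium_response(conc: float, r_max: float, kd: float) -> float:
    """Equilibrium response Req at analyte concentration `conc` (M)
    for a 1:1 binding isotherm with maximal response `r_max`."""
    return r_max * conc / (kd + conc)


# Hypothetical rate constants giving a Kd in the 1e-8 M range.
kd = kd_from_rates(k_on=1.0e5, k_off=2.0e-3)
print(f"Kd = {kd:.1e} M")

# Half-maximal response is reached when the concentration equals Kd.
for conc in (kd / 10, kd, kd * 10):
    print(f"{conc:.1e} M -> Req = {equilibrium_response(conc, 100.0, kd):.1f} arc s")
```

Measuring Req at several concentrations, as done here, allows Kd to be estimated from the equilibrium responses as well as from the ratio of the rate constants.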

2.5. Thermal stability of avidin mutants 

Purified avidin mutants, in the presence or absence of an excess of 
biotin, were combined with sample buffer (0.125 M Tris-HCl, pH 6.8/ 
4% (v/v) SDS/20% (v/v) glycerol/0.004% (v/v) bromophenol blue/10% 
(v/v) 2-mercaptoethanol) and incubated at selected temperatures, be- 
fore being subjected to SDS-PAGE as described by Bayer et al. [18]. 
The gels were stained using Coomassie brilliant blue. The stability of 
the proteins was followed by monitoring dissociation of the tetramer to the
monomeric form.

2.6. Non-specific binding assay 

Successive dilutions (1 µg, 200 ng, 40 ng and 8 ng) of a DNA
sample (salmon-sperm DNA or pGEM plasmid DNA) were applied
to nitrocellulose strips. The DNA was fixed to the strips by UV
irradiation, after which the strips were quenched using 5× Denhardt's
solution. The sample (20 µg in 1 ml PBS) was then added to a strip
and incubated at room temperature for 90 min. After that, the strips
were washed with PBS/0.05% Tween 20 solution and stained
immunochemically as described by Airenne et al. [16].

3. Results 

3.1. Design of avidin mutants 

In order to reduce the positive charge of avidin, a series of
six mutants was constructed with pI values ranging from 9.9 down to
4.7, as calculated theoretically from amino acid sequences using
the GCG program package (Genetics Computer Group,
Madison, WI, USA). The alterations were made by changing
selected basic amino acids to neutral or acidic ones using
site-directed mutagenesis (Table 1), and the proteins were named
according to the actual pI. The selection of amino acids for
lowering the pI was based on sequence comparison of avidin,
streptavidin and avidin-related proteins (AVR), combined
with the available crystallographic data of avidin. All the
altered amino acids are surface residues and, according to
structural information, have no major role in biotin binding
or stability of the avidin tetramer. Where possible, we tried to
change arginine residues rather than lysines, because the
terminal amino groups of lysines are important for derivatization of
avidin in many applications.
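
The effect of such substitutions on the theoretical pI can be sketched with a simple Henderson-Hasselbalch calculation. This is not the GCG algorithm; the pKa values below are illustrative textbook-style assumptions, and the 18-residue sequence is an artificial toy peptide (not avidin) constructed so that the mutations R2A, K3E and K9E can be applied to it.

```python
# Illustrative pKa values (assumed; not the GCG parameter set).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(seq: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (independent-site model)."""
    charge = 1.0 / (1.0 + 10.0 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10.0 ** (C_TERM_PKA - ph))
    for aa in seq:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10.0 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10.0 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """pH at which the net charge is zero, found by bisection
    (the net charge decreases monotonically with pH)."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def apply_mutations(seq: str, mutations: list[str]) -> str:
    """Apply substitutions written like 'R2A' (1-based positions)."""
    residues = list(seq)
    for mut in mutations:
        wild, pos, new = mut[0], int(mut[1:-1]), mut[-1]
        if residues[pos - 1] != wild:
            raise ValueError(f"{mut}: position {pos} is {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)


TOY_WT = "GRKGSGSGKGSGRGSGRG"  # hypothetical peptide, not the avidin sequence
toy_mutant = apply_mutations(TOY_WT, ["R2A", "K3E", "K9E"])
print(f"toy wild type pI: {isoelectric_point(TOY_WT):.1f}")
print(f"toy mutant pI:    {isoelectric_point(toy_mutant):.1f}")
```

Replacing basic residues with neutral or acidic ones removes positive charge and, for the acidic substitutions, adds negative charge, so the calculated pI drops, which is the design principle used for the avidin mutants.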

3.2. Expression and purification of avidin mutants 

We have shown previously [16] that avidin can be produced 
efficiently in Sf9 insect cells, using a baculovirus expression 
system. The expression of the avidin mutants was generally 
comparable to that of the wild type, although some correla- 
tion between production and the number of the mutations 
was evident - AvdpI9.4 and AvdpI7.9 being best in this re- 
spect (data not shown). All the mutants were purified to 95% 
purity (judged from SDS-PAGE gel) in one step, using affinity 
chromatography with 2-iminobiotin as a capturing ligand. 
For successful purification it was essential to use biotin-free 
culture medium; if biotin is present in the medium, it will 
block the binding sites of avidin/mutants thus hampering sub- 
sequent affinity purification and possible use of the avidin 
derivative in applied systems. 

3.3. Protein chemical analysis 

SDS-PAGE analysis showed that the native avidin and 
charge mutants separated into three components (data not 
shown), which presumably represented different stages of 
post-translational modification of glycosylated avidin [16]. 
There were minor changes in the migration pattern of certain 
mutants (especially AvdpI4.7), probably due to the extensive 
charge differences. The isoelectric points of the avidin mutants 
were determined with isoelectric focusing, using a pH gradient 
from 3 to 10 (Fig. 1). The experimental results corresponded 
well with the values theoretically calculated from the amino 
acid sequences (Table 1). The differences in the case of Avd- 
pI9.4 and AvdpI9.0 are probably due to poor resolution in the 
upper part of the pH gradient. The experimental value for 
avidin could not be determined with this system, since its pI
is over 10, which is out of range of the gel system used in this 
study. The quaternary status of the charge mutants, in the 
presence and in the absence of biotin, was examined by 
FPLC on a Superose 12 column (data not shown). Compar- 
ison of the elution profiles with molecular mass standards 



Table 1
The physicochemical properties of the avidin charge mutants

Name       Mutations                                  pI calculated   pI experimental   Kd (M)a
Avd        None                                       10.4            n.d.              2.0 × 10⁻⁸
AvdpI9.4   R122A, R124A                               9.9             9.4               1.4 × 10⁻⁷
AvdpI9.0   R26N, R59A                                 9.3             9.0               3.1 × 10⁻⁸
AvdpI7.9   R2A, K3E, K9E                              8.1             7.9               2.3 × 10⁻⁸
AvdpI7.2   K3E, K9D, R122A, R124A                     7.2             7.2               4.5 × 10⁻⁷
AvdpI5.9   R2A, K3E, K9E, R122A, R124A                5.9             5.9               3.1 × 10⁻⁷
AvdpI4.7   R2A, K3E, K9E, R26N, R59A, R122A, R124A    4.7             4.7               3.5 × 10⁻⁸

a Dissociation constant for 2-iminobiotin at pH 9.5; n.d., not determined.




Fig. 1. Isoelectric focusing of avidin charge mutants. Samples (5 µg)
of different charge mutants were applied to a polyacrylamide gel with
a pH gradient from 3 to 10. Following the run, proteins were
visualized by Coomassie brilliant blue staining. Lanes: 1: AvdpI9.4; 2:
AvdpI9.0; 3: AvdpI7.9; 4: AvdpI7.2; 5: AvdpI5.9; 6: AvdpI4.7;
M: pI standard.

showed that all the mutants behaved similarly to native avidin 
(56.6 kDa) forming stable tetramers (the molecular masses of 
different mutants varied from 55.3 to 60.5 kDa). 

3.4. Biotin-binding characteristics of avidin mutants

All six charge mutants bound strongly to 2-iminobiotin, 
since they were efficiently affinity purified in a single step using 
this moiety as a capturing ligand. To further characterize the 
biotin-binding affinities of different mutants, an optical bio- 
sensor instrument (IASyS Manual+) was used. When at- 
tempts were made to measure the binding constants of mutant 
proteins to immobilized biotin, no dissociation was observed 
(data not shown). This indicated that all the mutant proteins 
exhibited very high affinity constants for biotin - similar to 
the tenacious binding by the wild-type protein. To obtain 
precise information concerning the possible differences in af- 
finities among the various mutants, 2-iminobiotin was used as 
a ligand instead of biotin. Avidin binds 2-iminobiotin at ele- 
vated pH with a lower, readily measurable level of affinity. 
Binding curves for wild-type avidin and the different reduced
charge mutants, each measured at several concentrations, were
analyzed to determine their dissociation constants for
2-iminobiotin at pH 9.5 (Fig. 2). The calculated Kd of
2.0 × 10⁻⁸ M for avidin is in good agreement with that
reported previously by Green [6]. The values for the different
mutants varied from 1.4 × 10⁻⁷ M to 3.5 × 10⁻⁸ M (Table 1),
suggesting that their biotin-binding properties are quite
similar to those of wild-type avidin.
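
As an illustration of how a dissociation constant can be extracted from equilibrium biosensor responses of the kind shown in Fig. 2, the following minimal Python sketch assumes a simple 1:1 Langmuir model, Req = Rmax·C/(Kd + C), and fits it through the Hanes-Woolf linearization C/Req = C/Rmax + Kd/Rmax. The model choice, the function names, the concentration series and the Rmax of 100 response units are assumptions made for the example; they are not taken from the instrument software or from the measured data. Only the Kd of 2.0 × 10⁻⁸ M used to generate the synthetic points comes from the text.

```python
def langmuir(conc, k_d, r_max):
    """Equilibrium response for 1:1 binding (assumed model)."""
    return r_max * conc / (k_d + conc)


def fit_kd_hanes_woolf(concs, responses):
    """Estimate (K_d, R_max) from a straight-line fit of C/R against C."""
    xs = list(concs)
    ys = [c / r for c, r in zip(concs, responses)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals 1 / R_max
    intercept = mean_y - slope * mean_x  # equals K_d / R_max
    return intercept / slope, 1.0 / slope


# Synthetic, noise-free data (hypothetical concentrations in mol/l).
concs = [0.5e-8, 1e-8, 2e-8, 4e-8, 8e-8, 16e-8]
responses = [langmuir(c, 2.0e-8, 100.0) for c in concs]

k_d, r_max = fit_kd_hanes_woolf(concs, responses)
print(f"K_d = {k_d:.2e} M, R_max = {r_max:.1f}")
```

With noise-free input the fit recovers the Kd that generated the data. For real measurements a weighted non-linear fit would usually be preferable, because the linearization distorts the error structure.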

3.5. Thermal stability of avidin mutants 

In order to determine whether the mutations affected the 
thermal stability of the avidin tetramer, the avidin mutants 
and native avidin were diluted in SDS-containing buffer and 
heated to temperatures between 25°C and 100°C in the ab- 
sence and presence of biotin. It has been shown previously 
that under the conditions of this assay biotin stabilizes the 
quaternary structure of avidin [18]. The transition tempera- 



tures for biotin-saturated mutants should therefore be higher 
than those of the biotin-free derivatives. Biotin-free native 
avidin starts to dissociate into monomers at temperatures 
over 57°C (Fig. 3A), whereas with biotin the transition begins 
only at temperatures near 100°C (Fig. 3B). The results of the 
thermostability experiment showed that of all the mutant avi- 
dins AvdpI4.7 (Fig. 3C and D) was the most stable. Surpris- 
ingly, this mutant appeared to be slightly more stable than 
wild-type avidin. AvdpI9.4 and AvdpI9.0 showed dissociation 
profiles similar to that of native avidin, but biotin-free AvdpI7.9,
AvdpI7.2 and AvdpI5.9 appeared to dissociate,
presumably into monomers, at room temperature when SDS
was present (data not shown). Interestingly, in the presence 
of biotin they behaved in a manner similar to that of native 
avidin, requiring temperatures near 100°C to dissociate. 

3.6. Non-specific binding to DNA 

A dot-blot assay was used to investigate the status of the 
charge mutations with respect to the non-specific binding 
properties of avidin (data not shown). Wild-type avidin bound 
strongly to both single- and double-stranded DNA. Most of 
this binding seems to be charge-related, since there was a 
correlation between lowering the isoelectric point and reduced 
binding to DNA; AvdpI4.7 showed the lowest levels of bind- 
ing to DNA. 

4. Discussion 

During the last two decades, avidin has evolved into a key
component in many biotechnological applications such as
affinity-based separations and diagnostic assays [1]. Applications
of avidin-biotin technology utilize the extraordinary
affinity avidin has for biotin. While the usefulness of avidin is
impressive, there are some drawbacks associated with its
utilization in some applications. Most of these problems are
related to the high pI of avidin. For example, the alkaline
nature of avidin can cause charge-related binding to DNA 
and cell surfaces, which may hinder its use in certain circum- 



[Fig. 2 plot; x-axis: concentration (× 10⁻⁷ M)]

Fig. 2. Interaction of native avidin (closed circles) and AvdpI4.7
(open circles) with 2-iminobiotin. Various concentrations of avidin
or mutant were added to 2-iminobiotin-coated cuvettes, and binding
was measured in pH 9.5 buffer at room temperature using an IASyS
optical biosensor. The equilibrium response (Req) is plotted vs.
protein concentration. The Kd of the protein is equal to the
concentration at
[Fig. 3 plots; x-axis: temperature (°C)]

Fig. 3. Temperature-dependent dissociation of native avidin, AvdpI4.7 and their complexes with biotin. Samples of biotin-free and biotin-satu- 
rated avidin or AvdpI4.7 were combined with sample buffer and incubated for 20 min at the designated temperatures. The samples were then 
subjected to SDS-PAGE in 15% separating gels, and the gels were stained using Coomassie brilliant blue. Note that AvdpI4.7 tetramers (C and 
D, thin arrow) penetrated the upper gel, whereas those of the native avidin (A and B, thick arrow) failed to do so. Densitometry tracings from 
each of the gels were graphed as a function of temperature: A: avidin; B: avidin + biotin; C: AvdpI4.7; D: AvdpI4.7 + biotin.



stances. The high pI of avidin has also been a detriment to its
use in affinity-based drug targeting, since the positive charge
of avidin is considered to be one of the major reasons for its
rapid removal from the circulation system [3,19-21].

Site-directed mutagenesis is currently the method of choice 
to modify properties of a protein or to study the connection 
between its structure and function. Successful protein
engineering requires understanding of the basic concepts of
protein biosynthesis and structure. In the case of avidin, the
availability of crystallographic data [11,12] together with the
sequence information from avidin-related proteins [10] and
streptavidin [9] has made it possible to design such changes
in an intelligent manner.

In the present work, we wanted to study whether the high
pI of avidin can be reduced without losing its stability or high
affinity for biotin. For this purpose we constructed a set of
reduced charge mutants with isoelectric points ranging from
9.4 down to 4.7. Most of these mutations are based on natural
selection and evolution, since they were designed from
sequence-based comparison with streptavidin and avidin-related
proteins. Thus, Arg-2 was changed to Ala and Lys-3 to
Glu according to the sequence of streptavidin. Lys-9 was
replaced with Glu according to the sequence of the
avidin-related protein, AVR1. Arg-26 was modified to Asn according
to AVR1 and AVR2, and Arg-59 to Ala, since alanine
appears at this position in every AVR protein. In AvdpI7.2,
Lys-9 was replaced by Asp instead of Glu due to a PCR error. On
the other hand, Arg-122 and Arg-124 were both altered to
Ala, because these C-terminal residues were not observed in
the crystal structure of avidin [11].
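
The effect of such substitutions on the isoelectric point can be sketched by summing Henderson-Hasselbalch charges over the ionizable groups and locating the pH at which the net charge is zero. The Python example below is only a rough illustration: the pKa values are generic assumed values, the 17-residue peptide is invented for the example (it is not the avidin sequence), and the method used to obtain the calculated pI values in Table 1 is not stated in this section.

```python
# Approximate pKa values (assumed, generic values; not from the paper).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 3.1


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))    # N-terminal amine
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))   # C-terminal carboxyl
    for aa in seq:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(seq, tol=1e-4):
    """Bisect between pH 0 and 14; net charge falls monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mutate(seq, changes):
    """Apply substitutions written like 'R2A' (1-based positions)."""
    residues = list(seq)
    for change in changes:
        old, pos, new = change[0], int(change[1:-1]), change[-1]
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)


toy_peptide = "MRKASLTGKWENDLGHR"   # hypothetical peptide, not avidin
toy_mutant = mutate(toy_peptide, ["R2A", "K3E", "K9E"])
pi_parent = isoelectric_point(toy_peptide)
pi_mutant = isoelectric_point(toy_mutant)
print(f"parent pI ~ {pi_parent:.1f}, mutant pI ~ {pi_mutant:.1f}")
```

Replacing basic residues with neutral or acidic ones removes positive charge (and, for Glu, adds negative charge), so the zero-charge pH of the toy mutant falls below that of the parent, which is the same qualitative trend as the series in Table 1.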

The biotin-binding properties of all charge mutants were 
similar to those of native avidin, according to the biosensor 
data. The actual affinities for biotin could not be determined,
since it was essentially impossible to remove the bound
proteins from the biotin-derivatized cuvette. This was not
surprising, since the current biosensor technology can be used to
determine affinities only up to Ka ~ 10¹² M⁻¹. We therefore
decided to use 2-iminobiotin as a ligand. Chilkoti et al. [22]
have previously demonstrated 2-iminobiotin to be a good
reporter for intrinsic streptavidin-biotin interactions and the
same is likely to hold true for avidin. The affinities of mutants
for 2-iminobiotin (Table 1) were comparable to that of native avidin,
which suggested that their affinities for biotin were also
similar (Ka values probably in the neighborhood of 10¹³-10¹⁵ M⁻¹).
This was not so surprising because all the changed amino 
acids were on the surface of the protein. In this regard, it 
has been shown previously that, at least in the case of lyso- 
zyme, the structural changes resulting from mutations of sur- 
face residues are smaller and localized near the mutation, 
whereas those involving buried residues are larger and may 
be transmitted to other parts of the protein [23]. 

As reported earlier [18], native avidin failed to penetrate the
separating gel during non-denaturing SDS-PAGE and
remained in the aggregated state in the stacking gel (Fig. 3A,
arrow). This phenomenon was attributed to the high pI of
avidin and possible strong electrostatic interactions with
SDS. It is thus interesting to note that, under similar
non-denaturing conditions, AvdpI4.7 (Fig. 3C) and most of the
other reduced charge mutants migrated in the separating gel
in a manner consistent with that of a tetramer.

When the thermal stability of the charge mutants was 
studied, it was observed that AvdpI9.4, AvdpI9.0 and Avd- 
pI4.7 displayed very similar dissociation profiles to that of the 
wild-type avidin. In contrast, AvdpI7.9, AvdpI7.2 and
AvdpI5.9 showed some differences. In the absence of biotin, the
latter derivatives tended to dissociate into monomers even
at room temperature when SDS was present. Interestingly, all
of the mutants were stabilized by the presence of biotin; the 
same mutants required temperatures up to 100°C to undergo 
denaturation as does wild-type avidin. AvdpI7.9, AvdpI7.2 
and AvdpI5.9 all have mutations on Lys-3 and Lys-9 which 
suggests that either of these lysines (or both) could have some 
role in the stability of the avidin tetramer. On the other hand, 
AvdpI4.7 also bears both of these mutations, yet it seemed to 
be even slightly more stable than wild-type avidin. One
possibility is that the additional mutations in AvdpI4.7 (Arg-26
to Asn-26 and Arg-59 to Ala-59) somehow counteract the 
effect of mutations on Lys-3/Lys-9. The reasons for this enig- 
ma remain to be investigated further. 

In summary, we have constructed a progressive series of 
reduced charge mutants of the egg-white protein avidin, 
with pIs ranging from 9.4 to 4.7. All these mutant proteins



bound biotin very strongly, and their thermostability was, in 
most cases, comparable to that of the wild-type avidin. The 
mutants also displayed reduced non-specific binding charac- 
teristics to DNA when compared to wild-type avidin. Our 
results demonstrate that the charge properties of avidin can 
be modified by protein engineering without disturbing the 
crucial biotin-binding activity. These modified avidins (espe- 
cially AvdpI4.7) should also prove to be valuable in many 
applications of avidin-biotin technology, including affinity- 
based drug targeting and different separation technologies. 
Recombinant Core Streptavidins

A MINIMUM-SIZED CORE STREPTAVIDIN HAS ENHANCED STRUCTURAL STABILITY AND HIGHER 
Takeshi Sano, Mark W. Pandori*, Xiaomin Chen§, Cassandra L. Smith, and Charles R. Cantor

From the Center for Advanced Biotechnology and Departments of Biomedical Engineering and Pharmacology, Boston
University, Boston, Massachusetts 02215 and the ^Department of Molecular and Cell Biology, University of California,
Berkeley, California 94720



Two recombinant core streptavidins were designed 
and characterized to understand the role of the terminal 
sequences, present in naturally truncated core strept- 
avidins, in the properties of streptavidin. One recombi- 
nant core streptavidin, Stv-25, has an amino acid se- 
quence very similar to natural core streptavidins. The 
other recombinant molecule, Stv-13, has further trunca- 
tion of the terminal residues and consists essentially of 
only the β-barrel structure characteristic of streptavidin.
These recombinant core streptavidins are tetrameric
and bind four biotins/molecule, as does natural
streptavidin. The solubility characteristics of Stv-13, de- 
termined by varying the concentration of ammonium 
sulfate or ethanol, were almost the same as those of 
Stv-25 and natural core streptavidin. However, Stv-13 
showed an enhanced structural stability compared with 
Stv-25 and natural core streptavidin. For example, 
Stv-13 retained greater than 80% of its biotin binding
ability after incubation in 6 M guanidine hydrochloride
at pH 1.5, under which conditions, Stv-25 and natural 
core streptavidin retained only about 20% of their biotin 
binding ability. In addition, Stv-13 showed higher acces- 
sibility to biotinylated DNA than natural core streptavi- 
din. Apparently, the terminal regions, present on the 
surface of natural core streptavidin, spatially hinder 
biotinylated macromolecules from approaching the bio- 
tin binding sites. 



Streptavidin, a protein produced by Streptomyces avidinii, 
binds D-biotin with a remarkably high affinity (Kd ~ 10⁻¹⁵ M)
(1-4). This extremely tight biotin binding affinity has made the 
streptavidin-biotin system a powerful biological tool in a vari- 
ety of bioanalytical applications (5, 6). Streptavidin is generally 
isolated from culture media of S. avidinii. Such streptavidin 
molecules usually have truncated terminal sequences due to 
postsecretory cleavage of the terminal regions, which are
highly susceptible to proteolysis (4, 7-10). Nontruncated or 
only partially truncated streptavidins tend to form higher or- 
der aggregates and thus have poor solubility. In contrast, fully 
truncated streptavidin, termed natural core streptavidin, is 
free from aggregate formation and shows high solubility. However,
the terminal sequences of natural core streptavidin often
differ from preparation to preparation, and this heterogeneity 
can be seen even within single tetrameric molecules (9). Thus, 
many commercial preparations are treated with proteinases to 
further truncate the terminal sequences and to maximize the 
homogeneity of the terminal structure. 

In addition to numerous biotechnological applications, 
streptavidin generates considerable protein chemical interest, 
particularly as an attractive model for studying
macromolecule-ligand interactions (11-18). The determination of the
three-dimensional structure of core streptavidin by x-ray crys- 
tallography (19, 20) considerably expanded the understanding 
of the structural characteristics of this protein at the molecular 
level. However, no precise information about the structure of 
the chain termini was obtained in these studies because of the 
weak densities seen for these regions in the electron density 
maps. This indicates that the terminal regions, located on the 
surface of the molecule, are rather disordered and flexible (20) 
and that the terminal sequences have little contribution to the 
fundamental properties of streptavidin, such as formation of 
the stable β-barrel structure and biotin binding. Apparently,
these disordered structures are also responsible for the high 
proteinase susceptibility of the terminal regions. 

The objective of the present work was to produce, by genetic 
engineering, core streptavidin that has a homogeneous struc- 
ture. Such structurally homogeneous streptavidin molecules 
should be very useful in obtaining deeper understanding of the 
properties and structural characteristics of streptavidin. We 
were particularly interested in designing a minimum sized core 
streptavidin, which might have enhanced properties over nat- 
ural core streptavidins due to the lack of nonfunctional termi- 
nal residues. In this work, two recombinant core streptavidins 
were designed and produced. One recombinant core streptavi- 
din has a structure very similar to natural core streptavidins; 
the other has further truncation of the terminal sequences,
which have no apparent function. These recombinant core 
streptavidins were characterized to understand the roles of the 
terminal regions in the properties of streptavidin. 

EXPERIMENTAL PROCEDURES 

Construction of Expression Vectors— Expression vectors for recombi- 
nant core streptavidins were constructed using standard techniques 
(21). Oligonucleotide-directed in vitro mutagenesis (22) was used to 
introduce mutations into the coding sequence for streptavidin. 

Two expression vectors, pTSA-13 and pTSA-25 (Fig. 1), were con- 
structed using a cloned natural streptavidin gene (7) as the starting 
material. pTSA-13 carries a DNA sequence encoding amino acid resi- 
dues 16-133 of mature streptavidin (7), while pTSA-25 encodes amino 
acid residues 14-138. The coding sequences were cloned into pET-3a 
(23, 24) under the control of the φ10 promoter of bacteriophage T7.

Expression and Purification of Recombinant Core Streptavidins—
Expression of each recombinant core streptavidin was carried out by the
T7 expression system using BL21(DE3)(pLysE) (24) carrying an
expression vector, as described previously (25-27).

Fig. 1. Schematic illustration of the structures of various
streptavidin constructs. The amino acid sequence is based on
Argarana et al. (7). Single-letter amino acid codes are used to
indicate terminal sequences. A box represents the sequence from
Thr-20 to Phe-130. Stv-13 and Stv-25 are recombinant core
streptavidins designed in this work. The structure of natural core
streptavidin, obtained from Boehringer Mannheim, is from
Bayer et al. (10).

Purification of Stv-13 and Stv-25 was carried out by the method
described previously (25-27), including 2-iminobiotin affinity
chromatography (28). BL21(DE3)(pLysE) carrying pTSA-13 or pTSA-25, which
had been incubated for 4 h after induction, was used as the source. 

Determination of Solubility Characteristics— The solubility
characteristics of recombinant core streptavidins, Stv-13 and Stv-25, without
or with biotin were determined by varying concentrations of ammonium 
sulfate or ethanol. Natural core streptavidin (Boehringer Mannheim) 
was also analyzed for comparison. 

For analysis in the absence of biotin, the concentration of each core
streptavidin was adjusted to 5.7 nmol of subunit/ml in Tris-buffered
saline (150 mM NaCl, 20 mM Tris-Cl (pH 7.4), 0.02% NaN3). This
corresponds to 72 µg/ml for Stv-13, and 76 µg/ml for Stv-25 and natural
core streptavidin. To 100 µl of this protein solution, 1.1 ml of an
appropriate ammonium sulfate solution in Tris-buffered saline was
added to adjust the final concentration of ammonium sulfate (final
streptavidin concentration, 0.48 nmol of subunit/ml). The mixture was
allowed to stand at 30 °C for 30 min and centrifuged at 2,200 × g for 20
min. The amount of soluble streptavidin in the supernatant fraction
was determined by biotin binding assays described below. The fraction
of original streptavidin remaining in the supernatant is defined as the
relative solubility.
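
The arithmetic behind these figures can be checked with a short Python sketch. The concentrations and volumes are those given above; the function names and the supernatant value of 0.30 nmol/ml are hypothetical and serve only to show how a relative solubility would be computed.

```python
def diluted_concentration(c0, v_sample_ml, v_added_ml):
    """Concentration after adding v_added_ml to v_sample_ml of stock."""
    return c0 * v_sample_ml / (v_sample_ml + v_added_ml)


def subunit_mass_kda(ug_per_ml, nmol_per_ml):
    """Implied subunit mass; 1 ug/nmol equals 1000 g/mol, i.e. 1 kDa."""
    return ug_per_ml / nmol_per_ml


def relative_solubility(soluble_amount, total_amount):
    """Fraction of the original protein remaining in the supernatant."""
    return soluble_amount / total_amount


# 100 ul of a 5.7 nmol/ml solution plus 1.1 ml of ammonium sulfate solution.
final_conc = diluted_concentration(5.7, 0.100, 1.1)

# Subunit masses implied by 72 and 76 ug/ml at 5.7 nmol of subunit/ml.
mass_stv13 = subunit_mass_kda(72.0, 5.7)
mass_stv25 = subunit_mass_kda(76.0, 5.7)

# Hypothetical assay result: 0.30 nmol/ml still soluble after centrifugation.
example_solubility = relative_solubility(0.30, final_conc)

print(f"final concentration: {final_conc:.3f} nmol of subunit/ml")
print(f"implied subunit masses: {mass_stv13:.1f} and {mass_stv25:.1f} kDa")
print(f"example relative solubility: {example_solubility:.2f}")
```

The dilution gives about 0.475 nmol of subunit/ml, which the text reports as 0.48, and the two mass concentrations imply subunit masses of roughly 12.6 and 13.3 kDa.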

For analysis in the presence of biotin, the procedure was almost the
same as above, but the biotin binding sites of each core streptavidin
were saturated by adding an equimolar amount of D-[carbonyl-¹⁴C]biotin
(53 mCi/mmol; Amersham Corp.) prior to the addition of an
ammonium sulfate solution. The amount of soluble streptavidin in the
final supernatant was estimated from the radioactivity derived from
bound biotin, determined by liquid scintillation counting.

When ethanol was used, the procedures were essentially the same as 
those used with ammonium sulfate, but the following modifications 
were made. After the addition of ethanol, the final volume was adjusted 
to 1.2 ml by the addition of an appropriate ethanol solution to make the
final protein concentration constant for all samples. After incubation at
30 °C for 30 min, centrifugation was performed at 13,000 × g for 20
min.

Stability against Denaturation by GdnHCl¹— The structural stability
of core streptavidins was estimated from the biotin binding ability after
incubation in GdnHCl solutions at pH 7.4 or pH 1.5. Each of Stv-13,
Stv-25, and natural core streptavidin was incubated at 22 °C for 10 min
in 500 µl of an appropriate GdnHCl solution (final GdnHCl
concentration, 0-6.0 M) at a protein concentration of 270 pmol subunits/ml (1.7
µg/ml for Stv-13 and 1.8 µg/ml for Stv-25 and natural core
streptavidin). Then, 1.4 µl (680 pmol) of D-[carbonyl-¹⁴C]biotin was added to each
solution. The mixture was incubated at 22 °C for 10 min, and
streptavidin-biotin complexes were separated from free unbound biotin using
PD-10 columns (Pharmacia Biotech Inc.), which had been equilibrated
with the same GdnHCl solution. The amount of radioactive biotin
remaining bound to streptavidin was determined by liquid scintillation
counting.
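
As a rough check on the assay design, the sketch below computes the molar excess of labeled biotin over subunits in one incubation and shows how a percentage of remaining binding ability could be derived. The amounts (270 pmol subunits/ml, 500 µl, 680 pmol biotin) are from the procedure above; the function names and the two bound-biotin readings are hypothetical, and normalizing to the sample without GdnHCl is an assumption about how the percentages are expressed.

```python
def molar_excess(ligand_pmol, subunit_pmol_per_ml, volume_ml):
    """Ratio of added ligand to binding subunits in the incubation."""
    return ligand_pmol / (subunit_pmol_per_ml * volume_ml)


def percent_binding_retained(bound_in_denaturant, bound_without_denaturant):
    """Remaining binding ability relative to the denaturant-free sample."""
    return 100.0 * bound_in_denaturant / bound_without_denaturant


subunits_pmol = 270 * 0.5              # subunits in a 500 ul incubation
excess = molar_excess(680, 270, 0.5)   # biotin added per subunit

# Hypothetical scintillation readings (arbitrary units).
retained = percent_binding_retained(1080.0, 1350.0)

print(f"{subunits_pmol:.0f} pmol subunits, {excess:.1f}-fold biotin excess")
print(f"binding ability retained: {retained:.0f}%")
```

Under these numbers the labeled biotin is present at about a five-fold molar excess over subunits, so the bound radioactivity should reflect the number of functional binding sites rather than the amount of biotin supplied.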

Binding Ability for Biotinylated DNA— The binding ability of two 
core streptavidin species, Stv-13 and natural core streptavidin, for
biotinylated DNA was determined by using a 3179-base pair linear 

1 The abbreviations used are: GdnHCl, guanidine hydrochloride; 
PAGE, polyacrylamide gel electrophoresis.



double-stranded DNA target in which one of the 3' termini contains 
biotin. This target DNA was prepared by using an AccI-HindIII
fragment of the plasmid pGEM-3Zf(+) (Promega). Biotin was incorporated
into the HindIII terminus by fill-in reactions in the presence of a
biotinylated deoxynucleotide analog, biotin-14-dATP (Life
Technologies, Inc.), as described earlier (29). Core streptavidin and the
biotinylated target DNA were mixed at various ratios in 10 mM Tris-Cl (pH
8.0), 0.1 mM EDTA, and the mixtures were incubated at 37 °C for 90
min followed by electrophoretic separation on 1% agarose gels. DNA 
was stained with ethidium bromide. 

Other Methods— Gel filtration chromatography was carried out at
room temperature (22 °C) using a Sephacryl S-300 HR column (1.6 × 85
cm; Pharmacia), as described previously (26, 30). Biotin binding ability
was determined by gel filtration (31) using D-[carbonyl-¹⁴C]biotin and
PD-10 columns. SDS-PAGE (32) was carried out using 15%
polyacrylamide gels. Proteins were stained with Coomassie Brilliant Blue R-250.
The concentration of each streptavidin preparation was determined
from the absorbance at 280 nm using the following extinction
coefficients (E 0.1%, 280 nm): Stv-13, 3.55; Stv-25, 3.35; natural core streptavidin,
3.35 (33).

RESULTS AND DISCUSSION 
Design of Recombinant Core Streptavidins— Although mature
streptavidin has 159 amino acids/subunit (7), such full-length,
nontruncated molecules can rarely be seen under the
conditions generally used for the culture of S. avidinii. This is
due to very high susceptibility of the terminal regions of the
full-length molecule to proteolysis. Such full-length and only
partially truncated molecules tend to form higher-order
aggregates and have poor solubility (4, 7-10), although the biological
reason why S. avidinii produces streptavidin with poor
solubility and a tendency to aggregate is unknown. For these
reasons, such streptavidins are not useful in bioanalytical
applications. In contrast, fully truncated core streptavidins have
high solubility and show little tendency to aggregate (4, 7-10).
Thus, many commercial streptavidin preparations include
proteinase treatment to ensure full truncation of the terminal
sequences. However, the variable performance seen for natural
core streptavidin preparations may be attributable to
incomplete truncation of the terminal sequences and residual
proteolytic activity, resulting from proteinase treatment (34).

The objective of the present work was to design recombinant 
core streptavidin with a homogeneous structure. A particular 
motivation was derived from the fact that the structural
heterogeneity of natural core streptavidins reduces the resolution
obtainable in x-ray diffraction studies on streptavidin crystals
(19, 20). We were particularly interested in designing a
minimum sized core streptavidin, which might have enhanced
properties because of the removal of any nonfunctional terminal
sequences that are located on the surface of the molecules. 

In this work, two recombinant core streptavidins were
designed (Fig. 1). One recombinant core streptavidin, Stv-25, has
amino acid residues from Glu-14 to Ala-138 plus a methionine
residue at the N terminus, derived from a translation initiation
codon, and thus has a structure very similar to natural core
streptavidins. Other groups have also produced recombinant
core streptavidins (34-36), which are very similar to Stv-25.

Fig. 2. SDS-PAGE analysis of purified core streptavidins. Lane
a, Stv-13; lane b, Stv-25; lane c, natural core streptavidin; lane d,
molecular mass standard proteins (Pharmacia). Approximately 2 µg of
protein was applied to each lane of a 15% polyacrylamide gel. Proteins
were stained with Coomassie Brilliant Blue.

The other recombinant core streptavidin, Stv-13, has further 
truncation of the terminal sequences and consists of amino acid 
residues from Gly-16 to Val-133 plus a methionine residue at 
the N terminus. Previous crystallographic studies on strept- 
avidin using natural core streptavidin were able to refine the 
molecular structure only from Ala-13 or Glu-14 to Val-133 (19,
20), which corresponds almost perfectly to the stable β-barrel
structure consisting of the sequence from Gly-19 to Val-133. 
This implies that the terminal regions of natural core strept- 
avidins have little contribution to the fundamental properties 
of streptavidin, which should not be altered by the further
truncation of the terminal sequences made on Stv-13. 
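
The sizes of the constructs follow directly from the residue ranges quoted in the text. The short Python sketch below counts residues per subunit; the helper name is hypothetical, and the range 13-139 for natural core streptavidin is the one the text describes as likely for the commercial preparation, not a measured value for the batch used.

```python
def residue_count(first, last, extra_n_terminal_met=True):
    """Residues in an inclusive range, plus an optional initiator Met."""
    return last - first + 1 + (1 if extra_n_terminal_met else 0)


stv13 = residue_count(16, 133)   # Gly-16 to Val-133 plus Met
stv25 = residue_count(14, 138)   # Glu-14 to Ala-138 plus Met
natural_core = residue_count(13, 139, extra_n_terminal_met=False)

extra_in_stv25 = stv25 - stv13
# The text reports a subunit mass difference of about 650 Da on SDS-PAGE.
mean_mass_per_extra_residue = 650 / extra_in_stv25

print(stv13, stv25, natural_core, extra_in_stv25)
print(f"{mean_mass_per_extra_residue:.0f} Da per additional residue")
```

Stv-25 carries seven more residues per subunit than Stv-13, which is in line with the small but clear difference in migration between the two proteins.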

Expression and Purification of Recombinant Core Streptavi- 
dins— Expression of Stv-13 and Stv-25 was carried out using 
the T7 expression system, which allows efficient expression of 
various recombinant streptavidin constructs (25-27, 33-38).
Stv-13 was expressed very efficiently in Escherichia coli as
were other recombinant streptavidin constructs (25-27, 37). In
contrast, the expression efficiency of Stv-25 was considerably
lower. This is probably caused by codons for the terminal
sequences present in Stv-25 (but absent in Stv-13) that occur at
low frequencies in highly expressed E. coli genes. A similar
observation was reported with another recombinant core
streptavidin (35), where the expression efficiency in E. coli was
rather low with the natural streptavidin gene but significantly 
improved by using a synthetic gene containing codons observed 
in highly expressed E. coli genes. 

Expressed Stv-13 and Stv-25 were purified to homogeneity 
(Fig. 2) using a simple procedure that includes 2-iminobiotin 
affinity chromatography. SDS-PAGE analysis of purified pro- 
teins shows a clear difference in subunit molecular mass (650 
Da) between Stv-13 and Stv-25. Natural core streptavidin ob- 
tained from Boehringer Mannheim also showed a single band 
on SDS-PAGE, and its migration was very similar to that of
Stv-25. Although no terminal sequences were determined on 
the particular batch, this natural core streptavidin is likely to 
consist of amino acid residues 13-139, as shown by the termi- 
nal sequence analysis of the protein obtained from the same 



source (10), because the identity of the terminal residues is 
determined primarily by the proteinase treatment used. 

Each of Stv-13, Stv-25, and natural core streptavidin bound
greater than 0.96 molecules of biotin per subunit, indicating
that these molecules have full biotin binding ability. Gel filtra- 
tion chromatography using Sephacryl S-300HR showed that 
each of these core streptavidins is tetrameric and free from 
aggregate formation. 

Solubility Characteristics of Core Streptavidins— Significant 
differences are observed in solubility characteristics of core 
streptavidin and full-length or only partially truncated
streptavidin (4, 7-10). These differences suggest that the terminal
regions primarily determine the solubility characteristics of 
streptavidin. To understand the effect of terminal sequences 
remaining in natural core streptavidin on the solubility char- 
acteristics, the solubility of each core streptavidin species with 
and without biotin was investigated by varying the concentra- 
tion of ammonium sulfate or ethanol. 

The relative solubility of the three core streptavidins, Stv-13,
Stv-25, and natural core streptavidin, as the concentration of
ammonium sulfate was altered, showed biphasic changes (Fig.
3, A and B); the solubility decreased sharply with increasing
concentrations of ammonium sulfate up to 50% saturation and 
then increased with further increases in ammonium sulfate 
concentration. In the absence of biotin, Stv-13 showed slightly 
lower solubility than Stv-25 and natural core streptavidin at 
ammonium sulfate concentrations up to 50% saturation, but 
Stv-13 had the highest solubility at 90% saturation of ammo- 
nium sulfate. Biotin binding slightly increased the solubility of 
the core streptavidins. Similar to the solubility changes with- 
out biotin, Stv-13 showed slightly lower solubility than Stv-25 
and natural core streptavidin at ammonium sulfate concentra- 
tions up to 50% saturation but had the highest solubility at 70 
and 90% saturation in the presence of biotin. 

The three core streptavidins showed high solubility (greater
than 75%) at ethanol concentrations up to 70% in the absence
of biotin (Fig. 3C). At an ethanol concentration of 90%, only
about 30% of molecules remained soluble for all of the three
core streptavidins. Biotin binding had a slight effect on the
solubility (Fig. 3D). There is no marked difference in the
solubility characteristics in ethanol among the three core
streptavidin species.

Although Stv-13 lacks two charged amino acid residues,
Glu-14 and Lys-134, and two polar residues, Ser-136 and
Ser-139, there is no significant difference in solubility
characteristics among the three core streptavidin species. Although the
relative solubilities, determined for each core streptavidin
construct, may not indicate the true solubilities because the
protein solutions may not have reached equilibrium under the
incubation conditions used, these results clearly indicate that
the terminal regions of core streptavidins have minimal effects
on the solubility characteristics, unlike those of full-length
streptavidin.

Structural Stability of Core Streptavidins— One characteristic
that has made streptavidin one of the most frequently used
proteins in various biological analyses is its extremely high
structural stability. This allows, for example, conjugation of
streptavidin to partner molecules by using covalent chemistry
without disturbing its biotin binding ability. Very tight subunit
association of streptavidin also contributes to the overall
stability. For example, streptavidin remains tetrameric even in
the presence of SDS or urea (33, 39, 40). The subunit
association becomes even tighter upon biotin binding, and the
tetrameric structure can be partly maintained during heat treatment
in the presence of SDS, under which conditions streptavidin
without biotin dissociates completely into subunits (33). This
tighter subunit association upon biotin binding is essential for
maintenance of bound biotin because dissociated subunits
have a much reduced biotin binding affinity due to the lack of
intersubunit contacts made by Trp-120 to biotin through the
dimer-dimer interface (36, 37). Extremely harsh conditions are
required to effectively release bound biotin from streptavidin.
Although the known three-dimensional structure of core
streptavidin shows no apparent contact of the terminal
residues to biotin, the disordered structure of the terminal regions
(19, 20) may affect the overall stability of the protein.

To investigate how the terminal regions affect the overall
stability of streptavidin, urea gradient-PAGE (41) was
performed using polyacrylamide gels with a urea concentration
gradient from 0 to 10 M, along with an acrylamide
concentration gradient from 12 to 8%. As a control (41), urea
gradient-PAGE analysis showed a marked decrease in migration of
bovine serum albumin at high urea concentrations (data not
shown), indicating the unfolding of the molecule. However, the
three core streptavidins, Stv-13, Stv-25, and natural core
streptavidin, showed no appreciable changes in migration at
urea concentrations up to 10 M, indicating the extremely high
structural stability of streptavidin. This suggested that more
stringent denaturation conditions are needed to allow a
comparison of the stability of core streptavidins.

Thus, GdnHCl, a denaturant more potent than urea, was
used. At high concentrations and very acidic pH, GdnHCl
effectively denatures streptavidin and releases bound biotin.
Biotin-binding ability was used as an estimate of the structural
stability. Briefly, each core streptavidin species was incubated
in solutions containing various concentrations of GdnHCl at
pH 7.4 or 1.5, and then the remaining biotin binding ability was
determined by gel filtration (31) (Fig. 4, A and B).

At pH 7.4, almost no changes in biotin binding ability were
observed for all of the core streptavidins at GdnHCl
concentrations up to 4 M (Fig. 4A). At 6 M GdnHCl, the biotin
binding ability of Stv-25 and natural core streptavidin decreased
by approximately 20%, while Stv-13 showed almost no reduction
in biotin binding ability, suggesting that Stv-13 has a higher
stability against denaturation by GdnHCl than Stv-25 and
natural core streptavidin.

Fig. 4. Stability of core streptavidins against denaturation by GdnHCl. ●, Stv-13; ○, Stv-25; , natural core streptavidin. Stv-13, Stv-25,
and natural core streptavidin (270 pmol of subunit/ml) were incubated at 22 °C for 10 min in GdnHCl solutions at pH 7.4 (A) or 1.5 (B). Then, the
biotin binding ability of each core streptavidin was determined by gel filtration (31). All three core streptavidins bound greater than 0.96 molecules
of biotin/subunit at pH 7.4 without GdnHCl, and the biotin binding ability remaining in GdnHCl is indicated in percent.

The enhanced structural stability of Stv-13 over Stv-25 and
natural core streptavidin was observed even more clearly at pH
1.5 (Fig. 4B). Stv-13 retained almost full biotin binding ability
at GdnHCl concentrations up to 4 M. In contrast, Stv-25 and
natural core streptavidin lost approximately 15% of the biotin
binding ability at 4 M GdnHCl. At 6 M GdnHCl, Stv-13 retained
greater than 80% of the biotin binding ability, while only about
20% of the biotin binding ability was retained with both Stv-25
and natural core streptavidin.

These results demonstrate that Stv-13 has an enhanced
stability against denaturation by GdnHCl when compared with
Stv-25 and natural core streptavidin. This implies that the
terminal regions reduce the overall structural stability of
streptavidin.

Ability of Core Streptavidins to Bind Biotinylated
Macromolecules. Full-length or only partially truncated
streptavidin has a lower accessibility to biotinylated
macromolecules than natural core streptavidins (10), because of
steric hindrance caused by the terminal regions located on the
surface of the molecule. To estimate how the terminal sequences
of core streptavidin affect the binding to biotinylated
macromolecules, the biotinylated DNA-binding ability of two core
streptavidin species, Stv-13 and natural core streptavidin, was
investigated. Briefly, an end-biotinylated double-stranded DNA
target was mixed with core streptavidins at various ratios, and
the mixtures were separated by agarose gel electrophoresis
followed by staining the DNA targets with ethidium bromide.

Gel electrophoretic analysis of core streptavidin-biotinylated
DNA mixtures (Fig. 5) shows that larger amounts of dimeric
and trimeric biotinylated DNA targets, which are connected via
single streptavidin molecules, were formed with Stv-13 than
with natural core streptavidin at any molar ratio of
streptavidin subunit to biotin used. Correspondingly, smaller
amounts of DNA targets without streptavidin or with single
streptavidin molecules (only slightly retarded from free DNA
targets that were not well resolved under the electrophoresis
conditions used) were observed with Stv-13. Although this
analysis is not quantitative, the result indicates that Stv-13
has an enhanced binding ability for biotinylated DNA over
natural core streptavidin. The enhanced binding ability of
Stv-13 for biotinylated DNA reveals that the terminal regions,
present on the surface of natural core streptavidin, sterically
hinder the biotin binding sites and prevent biotinylated
macromolecules from approaching the biotin binding sites due,
presumably, to their disordered structure.

Fig. 5. Ability of core streptavidins to bind biotinylated DNA.
A 3179-base pair end-biotinylated double-stranded DNA target was
mixed with Stv-13 or natural core streptavidin at various ratios and
incubated at 37 °C for 90 min. Then, the mixtures were electrophoresed
on a 1.0% agarose gel and the DNA targets were stained with ethidium
bromide. The left and right lanes for each molar ratio of streptavidin
subunit to biotin from 0.25 to 1.0 are with Stv-13 and natural core
streptavidin, respectively. The lanes marked 0 and Ex are with the
biotinylated DNA target alone and with an excess amount of natural
core streptavidin (molar ratio of streptavidin subunit to biotin ~1,000),
respectively. Each lane contains 290 ng of the biotinylated DNA target.
The lane marked M contains a 1-kilobase pair DNA ladder (Life
Technologies, Inc.).
The intracellular compartmentation of biotin holocarboxylase
synthetase has been investigated in pea (Pisum sativum) leaves, 
by isolation of organelles and fractionation of protoplasts. 
Enzyme activity was mainly located in cytosol (approx. 90% of 
total cellular activity). Significant activity was also identified in 
the soluble phase of both mitochondria and chloroplasts. Two 
enzyme forms were separated by anion-exchange chromato- 
graphy. The major form was found to be specific for the cytosol 
compartment, whereas the minor form was present in mito- 
chondria as well as in chloroplasts. We also report the isolation 
and DNA sequence of a cDNA encoding an Arabidopsis thaliana 
biotin holocarboxylase synthetase. This cDNA was isolated by 
functional complementation of a conditional lethal Escherichia
coli birA (biotin ligase gene, which regulates biotin synthesis)
mutant. This indicated that the recombinant plant protein was 
able to biotinylate specifically an essential apoprotein substrate 
in the bacterial host, that is a subunit of acetyl-CoA carboxylase 
called biotin carboxyl carrier protein. The full-length nucleotide 
sequence (1534 bp) encodes a protein of 367 amino acid residues 
with a molecular mass of 41 172 Da and shows specific regions of 
similarity to other biotin holocarboxylase synthetase genes as 
isolated from bacteria and yeast, and with cDNA species from 
human. A sequence downstream of the first translation initiation 
site encodes a putative peptide structurally similar to organelle- 
targeting pre-sequences, suggesting a mitochondrial or chloro- 
plastic localization for this isoform. 



INTRODUCTION 

Biotin is a small coenzyme (vitamin H or B8), synthesized by
plants, most bacteria and some fungi, which occurs primarily in 
a protein-bound state within the cell. Biotinylated proteins use 
this prosthetic group as a carrier of activated carboxy groups 
during carboxylation and decarboxylation enzymic reactions. In 
all organisms, these carboxylases play housekeeping functions, 
such as acetyl-CoA carboxylase (EC 6.4.1.2; ACCase), which 
catalyses the first committed step in fatty acid biosynthesis (for 
a general review see [1]). 

Escherichia coli contains only a single biotinylated protein 
called biotin carboxyl carrier protein (BCCP), which functions as 
a subunit of ACCase. Biotinylation of apo-BCCP occurs through 
the action of a biotin ligase (EC 6.3.4.10). This enzyme catalyses 
the post-translational attachment of D-biotin to a specific Lys 
residue of newly synthesized apo-BCCP, via an amide linkage 
between the biotin carboxyl group and a unique ε-amino group
of a Lys residue [2]. This covalent attachment, essential for the
enzymic activation of ACCase, occurs in two distinct steps as
follows:

D-biotin + ATP → D-biotinyl 5'-AMP + PPi (1)

D-biotinyl 5'-AMP + apo-BCCP → BCCP + AMP (2)

Biotin ligase has been purified from E. coli and its gene cloned
[3,4]. This enzyme, also called BirA, is a 33.5 kDa protein [4] that
also acts as a repressor of the biotin operon [5]. Its
three-dimensional structure has recently been determined at 2.3 Å
resolution [6]. The complete nucleotide sequences of two other
bacterial genes that encode proteins homologous with the E. coli 
biotin: apoprotein ligase have been reported from Paracoccus 
denitrificans [7] and Bacillus subtilis [8] respectively. Mammalian 
cells also contain biotin-dependent carboxylases that are localized 
in different cell compartments, i.e. ACCase in the cytosol, and
3-methylcrotonoyl-CoA carboxylase (EC 6.4.1.4), propionyl-CoA
carboxylase (EC 6.4.1.3) and pyruvate carboxylase (EC 6.4.1.1) 
in mitochondria. As in bacteria, their activation from apo- to 
holo- (biotinylated) forms requires the action of a biotin ligase. 
Previous studies have demonstrated biotin ligase activity in both 
the cytosol and the mitochondria from mammalian cells [9,10], 
suggesting that biotinylation of biotin-dependent carboxylases 
occurs in their sites of enzymic activities. The corresponding 
enzymes from various mammalian species, which carry out the 
same reaction as bacterial biotin ligase, have been purified and 
were referred to as biotin holocarboxylase synthetase (HCS) 
[9,11]. Recently, clones encoding Saccharomyces cerevisiae HCS
gene [12] and human HCS cDNA species have been obtained
[13,14]. In plants, biotin and biotinylated proteins play a central
role in metabolism. For example, a mutation in the biotin
synthetic pathway is lethal for Arabidopsis thaliana [15], and 
plant ACCase is the target site of potent herbicides [16]. Biotin- 
dependent carboxylases are also present in different compart- 
ments of plant cells. As in bacteria and mammals, biotinylation 
of these enzymes is catalysed by HCS. Indeed, in a previous 
paper we provided the first direct evidence for the existence of 
HCS activity in plants [17]. In particular, we showed that the 



Abbreviations used: ACCase, acetyl-CoA carboxylase (EC 6.4.1.2); BCCP, biotin carboxyl carrier protein; DTT, dithiothreitol; GraPDH, glyceraldehyde
3-phosphate dehydrogenase (NADP+-dependent) (EC 1.2.1.13); HCS, biotin holocarboxylase synthetase (EC 6.3.4.10); PFP, pyrophosphate:fructose-
6-phosphate 1-phosphotransferase (EC 2.7.1.90); TTC, 2,3,5-triphenyltetrazolium hydroxychloride.

* To whom correspondence should be addressed.

The nucleotide sequence of A. thaliana biotin holocarboxylase synthetase cDNA will appear in DDBJ, EMBL and GenBank Nucleotide Sequence
Databases under the accession number U41369.
partly purified enzyme, as obtained from pea leaves, was able to
biotinylate specifically bacterial apo-BCCP as a substrate [17].
Nevertheless it is unknown whether plant biotin-dependent 
carboxylases are biotinylated within the cytosol and then trans- 
located into their final site of accumulation, or are targeted into 
organelles as apo-proteins and subsequently biotinylated. 

In this study we analysed purified chloroplasts and mito- 
chondria from pea leaves and also used protoplasts from pea 
leaves as a source of cytosol with low contamination. We 
conclude from these experiments that, in pea leaves, HCS activity 
can be detected in the cytosol, mitochondria and chloroplasts. 

In contrast, on the basis of the observation that plant HCS 
was able to biotinylate a bacterial substrate, we set out to clone 
a plant HCS cDNA by a functional complementation approach 
using an E. coli strain carrying a temperature-sensitive birA 
mutation and a plant cDNA expression library prepared from A. 
thaliana. This approach proved successful for isolating human
HCS cDNA clones and yeast HCS gene [12,14], thus dem- 
onstrating the capacity of HCS from eukaryotes in replacing the 
biotin ligase function of E. coli BirA protein. By this method we 
have, in the present study, cloned and characterized for the first 
time a cDNA encoding HCS from the plant kingdom. 

MATERIALS AND METHODS 
Materials 

D-[8,9-3H]Biotin (42 Ci/mmol) and [35S]methionine (1000 Ci/mmol)
were purchased from Amersham. D-Biotin, carbenicillin,
dithiothreitol (DTT), ATP, thiamin hydroxychloride and
2,3,5-triphenyltetrazolium hydroxychloride (TTC) were obtained
from Sigma Chimie SARL. Isopropyl β-D-thiogalactoside
was obtained from Bioprobe Systems. Casein hydrolysate for
vitamin-free assay was from Difco, and trichloroacetic acid was
from Merck.

Plant material 

Pea (Pisum sativum L., var. Douce Provence) plants were grown 
from seeds in soil for 8-10 days under a 12 h photoperiod of 
white light from fluorescent tubes (10-^0 µE/s per m²) at 18 °C.
The plants were watered each day with tap water. 

Preparation of pea leaf crude extract 

Pea leaves (2 g) were harvested and ground in liquid nitrogen 
with a mortar and pestle. The powder was then homogenized 
with 10 ml of 20 mM Tris/HCl (pH 7.8), 1 mM EDTA, 1 mM 
DTT, 5 mM ε-aminohexanoic acid and 1 mM benzamidine/HCl.
The suspension was centrifuged at 72000 g for 30 min (50 Ti
rotor; Beckman). The supernatant comprised the crude extract.
All procedures were performed at 4 °C. 

Preparation and fractionation of pea leaf protoplasts 

Pea leaf protoplasts were purified from young leaves (10 days
old) by the method of Baldet et al. [18]. Pea leaves (10-15 g) were
cut into fine strips (1 mm) and placed in a medium containing
10 mM Mes/NaOH (pH 5.5)/0.5 M sorbitol/1 mM CaCl2/
0.05 % (w/v) PVP-25 (buffer A). After vacuum infiltration, leaf
strips were washed twice in buffer A, then placed at 25 °C for
90 min in buffer A containing 2 % (w/v) cellulase Onozuka R10,
0.5 % (w/v) Macerozyme R10 and 0.2 % (w/v) Pectolyase Y-23
(Yakult Honsha Co., Shingikancho, Nishinomiya, Japan). All
subsequent procedures were performed at 4 °C. Protoplasts were
released from the digested tissues by gentle shaking and filtration
through a 100 µm nylon mesh. The filtrate was centrifuged at
100 g for 5 min (swinging-bucket rotor). The pelleted protoplasts
were resuspended in 50 ml of 0.5 M sorbitol/1 mM CaCl2/
20 mM Tris/HCl (pH 7.8) (buffer B). Two additional washes
were performed in buffer B before the protoplasts were finally
resuspended in buffer B supplemented with 5 mM
ε-aminohexanoic acid and 1 mM benzamidine/HCl, at a chlorophyll
concentration of 0.1 mg/ml. Protoplasts were gently ruptured by
being passed first through a 20 µm nylon mesh and then through
a 10 µm nylon mesh. About one-tenth of the lysed protoplast
suspension was used as the protoplast crude extract. The
remaining protoplast lysate was centrifuged at 300 g for 5 min.
The pelleted chloroplasts were suspended in buffer B containing
5 mM ε-aminohexanoic acid and 1 mM benzamidine/HCl. The
supernatant was then centrifuged at 12000 g for 20 min, resulting
in the mitochondrial (pellet) and cytosolic (supernatant) fractions.
The mitochondrial pellet was added to the chloroplastic fraction
to yield the organelle fraction.



Preparation of purified chloroplasts 

Young pea leaves (9 days old) were homogenized in 330 mM
sorbitol/50 mM Hepes/NaOH (pH 8)/1 mM EDTA/5 mM
DTT with a Waring blender. Intact chloroplasts were rapidly
prepared and purified with Percoll gradients as previously
described [19]. The morphological integrity of purified
chloroplasts was of the order of 95% as judged by ferricyanide
reduction by untreated and osmotically shocked organelles
respectively [20]. Intact chloroplasts were lysed in a buffer
containing 20 mM Tris/HCl (pH 7.8), 1 mM EDTA, 1 mM
DTT, 5 mM ε-aminohexanoic acid and 1 mM benzamidine/HCl
at a final protein concentration of 40 mg/ml after one
freeze-thawing cycle (5 min in liquid nitrogen, thawing at 25 °C).
The suspension of broken chloroplasts was centrifuged at 72000 g
over a 0.6 M sucrose cushion for 20 min. The pellet and the 
supernatant comprised the chloroplast membranes (envelope 
membranes and thylakoids) and the soluble fraction (stroma) 
respectively. All procedures were performed at 4 °C. 

Preparation of purified mitochondria 

Mitochondria were isolated and purified from young pea leaves 
by using self-generating Percoll gradients as described by Douce 
et al. [21]. The morphological integrity of purified mitochondria
was greater than 95% as determined by the rate of KCN-sensitive
cytochrome c-dependent O2 uptake in untreated and
osmotically shocked mitochondria respectively [21]. Total lysis
of mitochondria was achieved by three freeze-thawing cycles
(5 min in liquid nitrogen, thawing at 25 °C), performed in the
same buffer as that used for the lysis of purified chloroplasts (see
above), at a protein concentration of 40 mg/ml. The suspension
of broken mitochondria was centrifuged at 72000 g over a 0.6 M
sucrose cushion for 20 min. The pellet and the supernatant
comprised the mitochondrial membranes and the soluble fraction
(matrix) respectively. All procedures were performed at 4 °C.

Measurement of marker enzyme activities 

Except where otherwise noted, enzymes were assayed spectro- 
photometrically at 340 nm by coupling to oxido-reduction of 
NADH (or NADPH) in 1 ml reaction volumes. Triton X-100 
(0.05%, w/v) was added to each reaction mixture. The activity 
of each enzyme in pea leaf extract was strictly dependent on the 
presence of all necessary substrates and cofactors, and linear 
with respect to the amount of extract assayed. 
Glyceraldehyde-3-phosphate dehydrogenase

Glyceraldehyde-3-phosphate dehydrogenase (NADP+-dependent)
(EC 1.2.1.13; GraPDH) activity was measured by the
conversion of 1,3-bisphosphoglycerate into glyceraldehyde
3-phosphate [22]. The assay mixture contained 50 mM Hepes/
NaOH, pH 7.8, 1 mM MgCl2, 4 mM EDTA, 5 mM DTT,
0.2 mM NADPH, 1 mM ATP, 10 units of phosphoglycerate
kinase (EC 2.7.2.3), 10 units of triose-phosphate isomerase (EC
5.3.1.1) and 10 mM 3-phosphoglycerate.

Fumarase 

Activity was determined by the method of Hill and Bradshaw 
[23], by following the appearance of fumarate spectro- 
photometrically at 250 nm from reaction assays containing 
50 mM Hepes/NaOH, pH 8, and 50 mM malate. 

Pyrophosphate:fructose-6-phosphate 1-phosphotransferase

Pyrophosphate:fructose-6-phosphate 1-phosphotransferase (EC
2.7.1.90; PFP) activity was measured as described by Weiner et
al. [22]. The assay mixture contained 50 mM Tricine/NaOH,
pH 7.8, 0.5 mM MgCl2, 10 µM fructose 2,6-bisphosphate, 5 mM
fructose 6-phosphate, 10 units of triose-phosphate isomerase,
1 unit of glycerol-3-phosphate dehydrogenase (EC 1.1.1.8),
0.1 unit of aldolase (EC 4.1.2.13), 0.15 mM NADH and 0.6 mM
Na4P2O7.

Thermolysin treatment of purified intact organelles 

Proteolytic digestion was performed in accordance with a
previously published procedure [24]. Intact chloroplasts or
mitochondria (final concentration 10 mg/ml protein) were incubated
for 1 h at 4 °C in the following medium: 0.3 M sucrose/10 mM
Tricine/NaOH (pH 7.8)/1 mM CaCl2/0.2 mg/ml thermolysin
from Bacillus thermoproteolyticus (Boehringer). Thermolysin is
active only in the presence of Ca2+ ions and its activity can be
easily inhibited by the addition of 10 mM EGTA. Polypeptides
localized in the inner envelope membrane or in the stroma of
chloroplasts, and also in the inner membrane or the matrix space
of mitochondria, are not hydrolysed during the incubation
because thermolysin is unable to cross the outer envelope
membrane of chloroplasts or the outer membrane of
mitochondria. In addition, the integrity of organelles is maintained
during incubations under mild conditions (at 4 °C, with low
thermolysin concentration) [24]. Therefore after incubation for
1 h in the presence of thermolysin under these conditions,
chloroplasts and mitochondria were re-purified on a Percoll
gradient containing protease inhibitors (1 mM PMSF, 1 mM
benzamidine/HCl and 5 mM ε-aminohexanoic acid) to remove
the protease and the broken organelles. The treated intact
organelles were recovered, stored on ice and assayed for HCS
activity; the soluble enzymes were then analysed by
chromatography on a Mono Q anion-exchange column.

Mono Q anion-exchange chromatography 

All chromatography experiments were performed at 4 °C with 
Mono Q HR 5/5 (Pharmacia) coupled to an FPLC system 
(Pharmacia) to obtain precise and repeatable elution patterns. 
Soluble protein extracts (crude leaf extract, chloroplast stroma 
and mitochondrial matrix) prepared as described above were 
desalted on a Sephadex G-25 (M) column in a medium containing 
20 mM Tris/HCl, pH 8, 1 mM EDTA, 1 mM DTT, 1 mM 
benzamidine/HCl and 5 mM ε-aminohexanoic acid and then
loaded (2-5 mg) on the anion-exchange column pre-equilibrated
with the same buffer. After loading, the column was washed with
5 ml of buffer and eluted with the following linear NaCl gradients:
0-0.3 M NaCl (30 ml), 0.3-0.5 M NaCl (10 ml), 1 M NaCl (5 ml)
in the same buffer at a flow rate of 0.5 ml/min. Fractions of 1 ml
were collected and assayed for HCS activity.
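
The elution programme above can be written as a small lookup that returns the NaCl concentration at any elution volume. This is only a sketch: the function name `nacl_molar` is hypothetical, and the final 1 M NaCl segment is assumed to be a step rather than a ramp, which the text does not state explicitly.

```python
# Sketch of the Mono Q elution programme described in the text.
# Assumption: the last segment (1 M NaCl, 5 ml) is an isocratic step.

SEGMENTS = [
    # (segment volume in ml, start NaCl in M, end NaCl in M)
    (30.0, 0.0, 0.3),
    (10.0, 0.3, 0.5),
    (5.0, 1.0, 1.0),
]

FLOW_RATE_ML_PER_MIN = 0.5  # from the text


def nacl_molar(volume_ml: float) -> float:
    """NaCl concentration (M) at an elution volume counted after the 5 ml wash."""
    if volume_ml < 0:
        raise ValueError("elution volume must be non-negative")
    start = 0.0
    for length, c_start, c_end in SEGMENTS:
        if volume_ml <= start + length:
            fraction = (volume_ml - start) / length
            return c_start + fraction * (c_end - c_start)
        start += length
    raise ValueError("volume lies beyond the end of the programme")


def total_elution_minutes() -> float:
    """Run time of the gradient part of the programme at the stated flow rate."""
    return sum(length for length, _, _ in SEGMENTS) / FLOW_RATE_ML_PER_MIN


if __name__ == "__main__":
    for v in (15, 35, 45):
        print(f"{v} ml: {nacl_molar(v):.2f} M NaCl")
    print(f"gradient run time: {total_elution_minutes():.0f} min")
```

With 1 ml fractions, the fraction number after the wash maps directly onto the elution volume passed to `nacl_molar`.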



Latency measurements 

Corrections were made for extra-organellar activity by comparing 
the activities in ruptured and intact organelles. The organelles 
were kept intact by adding an osmoticum to the reaction medium 
(0.3 M sucrose) or ruptured by adding 0.05 % (w/v) Triton X- 
100. The percentage of latent activity is the ratio of organellar 
(ruptured minus intact) activity to the total (ruptured) activity. 
We verified that, under these conditions, enzymic activities were 
not affected by the presence of the detergent. 
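
The latency definition above reduces to a one-line calculation. The sketch below uses a hypothetical helper name and illustrative activity values; they are not measurements from this study.

```python
# Sketch of the latency calculation defined in the text:
# latent activity (%) = (ruptured - intact) / ruptured * 100.


def latent_activity_percent(ruptured: float, intact: float) -> float:
    """Percentage of the total (ruptured) activity that is organellar."""
    if ruptured <= 0:
        raise ValueError("ruptured (total) activity must be positive")
    if intact < 0:
        raise ValueError("intact activity must be non-negative")
    return 100.0 * (ruptured - intact) / ruptured


if __name__ == "__main__":
    # Illustrative numbers only (arbitrary activity units).
    total_with_triton = 10.0    # organelles ruptured with 0.05 % Triton X-100
    intact_in_sucrose = 2.0     # organelles kept intact in 0.3 M sucrose
    print(latent_activity_percent(total_with_triton, intact_in_sucrose))
```

A latency close to 100 % means that almost none of the activity is accessible without rupturing the organelles.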



Protein and chlorophyll determinations 

Protein was measured by the method of Bradford [25] using Bio- 
Rad protein assay reagent (Bio-Rad Laboratories) with bovine 
γ-globulin as a standard. Chlorophyll was measured by the
method of Arnon [26]. 

Bacterial strains and growth conditions 

A temperature-sensitive E. coli birA215 mutant (strain BM4050) 
lacking HCS activity in vitro was generously provided by Dr. 
A. M. Campbell (Stanford University, Stanford, CA, U.S.A.) 
[27,28]. Mutations in the birA gene affect the biotin ligase 
function of the BirA protein, resulting in biotin auxotrophy. 
This strain grows normally at 30 °C on minimal medium M9
(48 mM Na2HPO4/22 mM KH2PO4/19 mM NH4Cl/8.5 mM
NaCl/1 mM MgSO4/0.1 mM CaCl2) supplemented with 0.2%
glucose, 2 nM biotin, 0.4% casein hydrolysate, 100 µg/ml TTC
and 1 µg/ml thiamin. In contrast, bacteria fail to grow under the
same conditions (either in liquid medium or on plate) but at a
temperature of 43 °C, since at this temperature the mutant
biotin apoprotein ligase has a greatly decreased affinity for
biotin. When required, carbenicillin was added at 100 µg/ml.
Competent cells, grown in Luria-Bertani broth, were prepared 
according to the method of Dower et al. [29]. 

Measurement of HCS activity 

HCS activity was determined by measuring the covalent
attachment of D-[3H]biotin to bacterial apo-BCCP by a
trichloroacetic acid precipitation assay as described previously
[17]. The apo-BCCP protein substrate was prepared from the E. coli
birA215 mutant grown on glucose minimal medium supplemented
with limiting amounts of D-biotin (0.4 nM) [17]. The
reaction mixture contained the following components in a total
volume of 200 µl: 5 mM ATP, 5 mM MgCl2, 0.2 mM DTT in
20 mM phosphate buffer (pH 7.5), 10-200 µg of enzyme extract,
180 µg of apo-BCCP extract and 250 nM D-[3H]biotin. The
reaction was initiated by addition of the labelled D-biotin.
Incubations were for 20-60 min at 37 °C. Then 125 µl aliquots of
the reaction mixture were collected on glass microfibre filters
(Whatman GF/C) and proteins were precipitated by five washes
of the filters in 10% (w/v) trichloroacetic acid. The filters were
washed once with ethanol and dried; radioactivity was counted
in scintillation vials containing 8 ml of scintillation liquid (Ready
protein; Beckman). Duplicate assays without apo-BCCP and/or
HCS extract were run as controls.
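
As an illustration of how filter counts from such an assay might be converted into incorporated biotin, the sketch below uses the stated specific radioactivity (42 Ci/mmol) and the standard conversion 1 Ci = 2.22 x 10^12 d.p.m. It assumes that counts are already corrected to d.p.m., that the label was used undiluted, and that the blank is one of the control assays; the function and variable names are hypothetical and this is not the authors' stated calculation.

```python
# Sketch: convert blank-corrected d.p.m. on a filter into pmol of biotin
# incorporated in the whole reaction. Assumptions are listed in the lead-in.

DPM_PER_CURIE = 2.22e12            # standard definition of the curie
SPECIFIC_ACTIVITY_CI_PER_MMOL = 42.0   # from the Materials section
PMOL_PER_MMOL = 1e9

# d.p.m. given by 1 pmol of undiluted D-[3H]biotin
DPM_PER_PMOL = SPECIFIC_ACTIVITY_CI_PER_MMOL * DPM_PER_CURIE / PMOL_PER_MMOL


def biotin_incorporated_pmol(
    sample_dpm: float,
    blank_dpm: float,
    aliquot_ul: float = 125.0,   # volume spotted on the filter (from the text)
    reaction_ul: float = 200.0,  # total reaction volume (from the text)
) -> float:
    """pmol of biotin incorporated in the whole reaction mixture."""
    net_dpm = sample_dpm - blank_dpm
    if net_dpm < 0:
        raise ValueError("sample counts are below the blank")
    pmol_on_filter = net_dpm / DPM_PER_PMOL
    return pmol_on_filter * reaction_ul / aliquot_ul


if __name__ == "__main__":
    # Illustrative counts only.
    print(f"{DPM_PER_PMOL:.0f} d.p.m. per pmol")
    print(f"{biotin_incorporated_pmol(50000.0, 3380.0):.3f} pmol")
```

Dividing the result by the incubation time and the amount of enzyme extract would give a specific activity in the units used in the tables (mol/min per mg of protein), after converting pmol to mol.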
Isolation of A. thaliana cDNA clones encoding HCS by functional
complementation of E. coli birA215

A λYES-R cDNA expression library prepared with mRNA
species from A. thaliana plants was first converted into a plasmid
library carrying ampicillin resistance, as described by Elledge et
al. [30]. E. coli birA215 competent cells were transformed by
electroporation with this plasmid library by using a Bio-Rad
Gene Pulser operating at 15 kV/cm pulse, 25 µF and 200 Ω.
Then bacteria were suspended in 1 ml of SOC broth [31] and
incubated at 37 °C with shaking (250 rev./min) for 1 h. Cells
were washed twice in M9 medium to eliminate free biotin and
plated on M9 plates, supplemented with 2 nM biotin, 0.4%
casein hydrolysate, 100 µg/ml TTC, 1 µg/ml thiamin and
100 µg/ml carbenicillin. Cell culture was conducted at 43 °C to
select rescued clones. Transformation efficiency was determined
by cell growth at 30 °C on identical M9 plates. Eight successive
electroporations were performed as described above, with a
transformation frequency of approx. 3 x 10® transformants per
µg of plasmid DNA. Plasmids were isolated [31] from survivor
colonies growing at 43 °C on M9/carbenicillin plates, and the
original E. coli birA215 mutant was transformed again with the
purified plasmids to confirm the functional complementation.
Only transformants that survived the second round of selection
were used for further analyses.

Preparation of bacterial crude extract 

To assess the recombinant HCS activity, the biotin auxotroph
bacterial strain E. coli birA215 (strain BM4050) (untransformed
and transformed) as well as the control wild-type strain E. coli
BirA+ (strain BM2661) were grown on Luria-Bertani liquid
medium (supplemented with carbenicillin in the case of
transformed bacteria) at 37 °C, until A600 reached 0.5. After induction
of the pYES lac promoter with 1 mM isopropyl β-D-thiogalactoside
for 4 h, cells were collected by centrifugation at 3000 g for
15 min (JA 20 rotor; Beckman) and suspended in buffer A
containing 20 mM phosphate buffer, pH 7.5, 1 mM Na2EDTA,
1 mM DTT and a mixture of protease inhibitors (1 mM PMSF,
5 mM ε-aminohexanoic acid and 1 mM benzamidine/HCl). Cells
were then disrupted by sonication at 0 °C and lysates were
centrifuged for 15 min at 15000 g to remove cell debris (JA 20
rotor; Beckman). The supernatant was desalted by passage
through a Sephadex G25 (M) column (Pharmacia) equilibrated
in buffer A.

Subcloning and DNA sequencing 

The cDNA inserts were subcloned into the EcoRI site of vector
plasmid pBluescript II SK(-) (Stratagene). DNA sequence
analysis was performed on both strands by using Prism Kit with
fluorescent dideoxynucleotides, Taq DNA polymerase (Applied
Biosystems) and T3 and T7 universal primers. In addition,
specific oligonucleotide primers were used for further sequencing.
Gene Works 2.4 and PCGENE (IntelliGenetics) software
packages were used for sequence analyses.

Transcription-translation of pBluescript HCS in vitro 

Transcription-translation in vitro of pBluescript HCS (which
contains the entire region of A. thaliana HCS cDNA cloned
under the control of the T7 promoter) was done with T7 RNA
polymerase and a rabbit reticulocyte lysate from Novagen kit
(Single Tube Protein System 2, T7) and [35S]methionine
(Amersham) in accordance with the instructions of the
manufacturer. For the reaction the DNA template (2 µg of pBluescript
HCS-2) was transcribed at 30 °C for 15 min, followed by
translation reaction (in the presence of 40 µCi of [35S]methionine)
at 30 °C for 60 min. A control reaction lacking a DNA template
was performed under the same conditions. The translational
products were precipitated with 10% (w/v) trichloroacetic acid.
After a 1 h incubation at 0 °C and centrifugation at 30000 g for
15 min at 4 °C (JA 20 rotor; Beckman), the precipitated protein
pellets were washed with cold acetone, air-dried and resuspended
in 50 mM Tris/HCl (pH 8)/1 mM PMSF/5 mM
ε-aminohexanoic acid/1 mM benzamidine/HCl. Polypeptides labelled
with [35S]methionine were analysed by SDS/PAGE and
fluorography. SDS/PAGE [10% (w/v) gel] was performed at room
temperature in slab gels (15 cm x 15 cm). The experimental
conditions for gel preparation, sample solubilization,
electrophoresis and gel staining were as detailed by Chua [32]. After
staining, gels were soaked in Amplify solution (Amersham) and
dried before fluorography for 1 week on X-ray films.

Genomic Southern blot analysis 

Total DNA was isolated from young A. thaliana plants. Approx.
2-5 µg of DNA was digested overnight with appropriate
restriction enzymes (New England Biolabs) and fragments were
separated by 0.8% agarose gel electrophoresis. After blotting to
a Hybond-N+ membrane (Amersham), hybridization was
performed with the entire A. thaliana HCS-2 cDNA or the
EcoRI-EcoRV 5' end of the pBluescript HCS-2 cDNA probes
that had been 32P-labelled with a random priming kit (Pharmacia)
[31]. The nucleic acid hybridization solution was composed of
6 x SSC (the stock solution was 20 x SSC, containing 175.3 g/l
NaCl and 88.2 g/l sodium citrate, pH 7), 0.5% SDS and 0.25%
low-fat dried milk. Hybridization proceeded overnight at 65 °C
and membranes were washed at 65 °C for 30 min in 1 x SSC/0.1 %
SDS and 0.1 x SSC/0.1 % SDS.
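
The salt concentrations implied by the SSC dilutions above can be checked with a short calculation. The sketch assumes that "sodium citrate" means trisodium citrate dihydrate (294.10 g/mol), which the text does not specify; the function name is hypothetical.

```python
# Sketch: molar composition of n x SSC, derived from the 20 x stock recipe
# given in the text (175.3 g/l NaCl, 88.2 g/l sodium citrate).

NACL_G_PER_L_20X = 175.3
CITRATE_G_PER_L_20X = 88.2

NACL_G_PER_MOL = 58.44
CITRATE_G_PER_MOL = 294.10  # assumption: trisodium citrate dihydrate


def ssc_molarity(fold: float) -> tuple[float, float]:
    """Return (NaCl, citrate) molarity for an n-fold SSC solution."""
    if fold < 0:
        raise ValueError("fold concentration must be non-negative")
    scale = fold / 20.0
    nacl = NACL_G_PER_L_20X / NACL_G_PER_MOL * scale
    citrate = CITRATE_G_PER_L_20X / CITRATE_G_PER_MOL * scale
    return nacl, citrate


if __name__ == "__main__":
    for fold in (20, 6, 1, 0.1):
        nacl, citrate = ssc_molarity(fold)
        print(f"{fold:>4} x SSC: {nacl * 1000:7.1f} mM NaCl, "
              f"{citrate * 1000:6.1f} mM citrate")
```

Under these assumptions the hybridization solution (6 x SSC) is close to 0.9 M NaCl, and the final wash (0.1 x SSC) is close to 15 mM NaCl, which is what makes the last wash the more stringent one.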

RESULTS 

Intracellular localization of HCS activity in pea leaves 

Protoplasts, isolated by enzymic digestion from young pea leaves 
(9 days old), were fractionated by gently rupturing through a fine 
nylon mesh followed by centrifugation to yield an organelle 
fraction (pellet) and a cytosolic fraction (supernatant). These two 
fractions were assayed for HCS activity, and their purity was 
assessed by measurement of selected subcellular marker enzyme 
activities. Marker enzymes used were fumarase for mitochondria, 
GraPDH for chloroplasts, and PFP for cytosol. As shown in 
Table 1, most of the chloroplast (83 %) and mitochondrial
(87 %) marker activities were recovered in the pellet. Because it
was difficult to rupture the small protoplasts quantitatively
without affecting chloroplast integrity, we performed an
incomplete rupture of the protoplasts to obtain only minimal
chloroplastic contamination in the cytosolic fraction
(supernatant). Thus approx. 31 % of the total cytosolic marker activity
was recovered in the pellet. The supernatant obtained after
differential centrifugation of lysed pea leaf protoplasts contained 
a substantial proportion of cytosolic marker enzymes (63 % of 
the total activity) and was only slightly contaminated by chloro- 
plast and mitochondrial enzymes (Table 1). HCS activity was 
detected both in the supernatant (59 %) and in the pellet (38 %). 
As the activity of HCS in the organelle pellet was not very 
different from the cytosolic contamination level, it was difficult 
to conclude that chloroplasts and/or mitochondria contained an 
HCS form. From these results it is clear that, in higher plant 
cells, HCS activity is located mostly in the cytosol (probably up 
to 90 % of total cellular activity). 
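
One way to see how the marker distributions lead to an estimate near 90 % is a simple two-compartment mixing model. This is a sketch under stated assumptions, not the authors' own calculation: HCS is treated as a mixture of a cytosolic pool that partitions like PFP and an organellar pool that partitions like GraPDH or fumarase, and the pellet and supernatant percentages are those reported in Table 1. All function names are hypothetical.

```python
# Sketch: estimate the cytosolic fraction f of HCS from marker distributions.
# Model: h = f * p_cyt + (1 - f) * p_org, where h, p_cyt and p_org are the
# pellet shares of HCS, the cytosolic marker and an organellar marker.


def pellet_share(pellet_pct: float, supernatant_pct: float) -> float:
    """Share of the recovered activity that is found in the pellet."""
    return pellet_pct / (pellet_pct + supernatant_pct)


def cytosolic_fraction(hcs, cytosol_marker, organelle_marker) -> float:
    """Solve the mixing equation for f; each argument is (pellet %, supernatant %)."""
    h = pellet_share(*hcs)
    p_cyt = pellet_share(*cytosol_marker)
    p_org = pellet_share(*organelle_marker)
    return (p_org - h) / (p_org - p_cyt)


# (pellet %, supernatant %) taken from Table 1
HCS = (38, 59)
PFP = (31, 63)        # cytosol marker
GRAPDH = (83, 7)      # chloroplast marker
FUMARASE = (87, 3)    # mitochondrial marker

f_with_chloroplast_marker = cytosolic_fraction(HCS, PFP, GRAPDH)
f_with_mitochondrial_marker = cytosolic_fraction(HCS, PFP, FUMARASE)

if __name__ == "__main__":
    print(f"{f_with_chloroplast_marker:.2f}")
    print(f"{f_with_mitochondrial_marker:.2f}")
```

Both organellar markers give a cytosolic fraction of roughly 0.9 under this model, in line with the estimate in the text; the standard deviations in Table 1 are ignored here, so the figure is only indicative.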
Table 1 Subcellular localization of HCS in pea leaf protoplasts

HCS activity was assayed in broken pea leaf protoplasts (protoplast extract, 12.5 mg of protein),
organelles (pellet) and cytosol (supernatant) as described in the Materials and methods section
in the presence of 0.05 % (w/v) Triton X-100. Also indicated are the activities of marker enzymes
to check for cross-contamination. The marker enzymes were as follows: cytosol, PFP;
chloroplasts, GraPDH; mitochondria, fumarase. The results presented for distribution in the
supernatant and pellet and for recovery are expressed as a percentage of the total activity
recovered, and are means ± S.D. for four different experimental determinations.



Enzyme activity                              PFP            GraPDH         Fumarase       HCS

Activity in protoplast extract (mol/min)     154 × 10⁻⁹     1108 × 10⁻⁹    45 × 10⁻⁹      0.61 × 10⁻¹²
Distribution in pellet (%)                   31 ± 5         83 ± 1         87 ± 3         38 ± 1
Distribution in supernatant (%)              63 ± 8         7 ± 2          3 ± 1          59 ± 9
Recovery (%)                                 94 ± 13        90 ± 3         90 ± 4         97 ± 10
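
As a quick consistency check on Table 1 (a sketch added here, not part of the original analysis), the mean pellet and supernatant percentages should add up to the reported recovery for each enzyme. The dictionary layout and names below are illustrative.

```python
# Mean values (%) transcribed from Table 1: (pellet, supernatant, recovery).
table1 = {
    "PFP": (31, 63, 94),
    "GraPDH": (83, 7, 90),
    "Fumarase": (87, 3, 90),
    "HCS": (38, 59, 97),
}

def recovery_matches(pellet, supernatant, recovery, tolerance=1):
    """True if pellet + supernatant equals the reported recovery within tolerance."""
    return abs((pellet + supernatant) - recovery) <= tolerance

checks = {enzyme: recovery_matches(*values) for enzyme, values in table1.items()}
print(checks)
```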



Table 2 Distribution of activities of HCS and marker enzymes in Percoll-
purified chloroplasts and mitochondria from pea leaves

Preparation of purified organelles, measurement of different enzymic activities in the presence
of 0.05 % (w/v) Triton X-100, and latent activity definition were as described in the Materials
and methods section. The marker enzymes were identical with those used for the fractionation
of purified pea protoplasts (Table 1). Results are from a representative experiment repeated three
times. Abbreviation: n.d., not detectable.

                 Enzyme activity (mol/min per mg of protein)                  HCS latency
Sample           PFP      GraPDH         Fumarase       HCS                   (%)

Chloroplasts     n.d.     342 × 10⁻⁹     8 × 10⁻⁹       0.026 × 10⁻¹²         80
Mitochondria     n.d.     12 × 10⁻⁹      870 × 10⁻⁹     0.041 × 10⁻¹²         92


With the aim of determining the possible occurrence of HCS 
activity in plant cell organelles, we conducted a large-scale 
purification of intact pea leaf chloroplasts and mitochondria on 
Percoll gradients. Assays of various selected marker enzymes for
chloroplasts (GraPDH), mitochondria (fumarase) and cytosol 
(PFP) showed that this purification procedure completely 
eliminated cytosolic contamination. Furthermore, both the 
chloroplast and the mitochondrial fractions were essentially free 
from mitochondrial and chloroplast contamination, respectively 
(Table 2). HCS activity was found to be associated with both 
mitochondria and chloroplasts, with a specific activity of 0.041
and 0.026 pmol/min per mg of protein respectively (Table 2). On
a protein basis, these values were one-quarter to one-sixth of
those measured in the cytosolic fraction prepared from frac- 
tionated protoplasts (0.17 pmol/min per mg of protein). Never- 
theless the total absence of cytosolic contamination and the high 
latency values of HCS activity measured in both purified chloro- 
plasts and mitochondria demonstrated that the enzyme activity 
was present within the organelles (Table 2). 
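
The comparison with the cytosolic fraction is simple arithmetic on the specific activities quoted above; the sketch below (variable names are illustrative) makes the one-quarter to one-sixth statement explicit.

```python
# Specific HCS activities in pmol/min per mg of protein, as quoted in the text.
cytosol = 0.17
organelles = {"mitochondria": 0.041, "chloroplasts": 0.026}

# How many times lower each organelle value is than the cytosolic value.
fold_lower = {name: cytosol / activity for name, activity in organelles.items()}

for name, fold in fold_lower.items():
    print(f"{name}: about 1/{fold:.1f} of the cytosolic specific activity")
```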




[Figure 1: elution profiles; x-axis, fraction number (0-50)]



Figure 1 Separation of multiple forms of HCS from pea leaves by ion-
exchange chromatography

Pea leaf extracts were fractionated by chromatography on a Mono Q HR 5/5 column in 20 mM
Tris/HCl (pH 8)/1 mM EDTA/1 mM DTT/5 mM ε-aminohexanoic acid/1 mM benzamidine/HCl
at 4 °C. After being loaded, the column was washed with 5 ml of buffer and eluted with a NaCl
gradient at 0.5 ml/min. Fractions of 1 ml were collected and assayed for HCS activity. The
samples loaded were: (A) leaf crude extract (5 mg); (B) cytosol (2 mg); (C) mitochondrial
extract (23 mg); (D) chloroplast extract (4.9 mg); (E) extract from thermolysin-treated
chloroplasts (3.8 mg).



Separation of HCS activities by Mono Q anion-exchange 
chromatography 

To determine whether HCS activities found in pea leaves 
represented different forms of the enzyme, extracts from pea 
leaves (crude leaf extract, soluble proteins from Percoll-purified 
chloroplasts and mitochondria, and cytosol from leaf proto-
plasts) were fractionated on a Mono Q HR 5/5 column. Recovery
of HCS activity was in the range 85-90% for each of the
chromatographies performed. The elution profile obtained for
the leaf crude extract (Figure 1 A) shows that HCS activity can be 
resolved into two peaks (A and B). The major peak (peak B), 
eluted at 140 mM NaCl, represented approx. 90% of the total
activity. Peak A, eluted at 50-60 mM NaCl, contained approx.
10% of the total activity present in the crude extract. The 
cytosolic pattern (Figure 1 B) exhibited only one peak of HCS 
activity, displaying chromatographic properties similar to those 
of peak B in the crude extract. In purified mitochondria (Figure
1C), HCS activity was also eluted as a single peak, but with a
profile similar to that of peak A in the crude extract. Finally, the
elution profile obtained with the chloroplast extract (Figure 1D)
contained three peaks of HCS activity. The major peak (85% of 
the total activity) and one of the minor peaks (10% of the total 
activity) had similar profiles to those of peak A and peak B in the 
crude extract respectively. The last minor peak detected in 
chloroplast extracts was eluted at approx. 170 mM NaCl, and
was referred to as peak C. From these results we can conclude 
that peak A is associated with both mitochondria and chloro- 
plasts, and peak B with cytosol. Peak C, which was detected in 
chloroplasts but not in crude extracts, might correspond to a 
degraded form of chloroplast HCS. Finally, peak B, the com- 
ponent of the cytosolic fraction, was always found in the 
chloroplast extract, although cytosolic contamination of chloro- 
plasts was always negligible (Table 2). This observation led us to 
question whether some cytosolic HCS could be loosely adsorbed 
to the outer surface of the outer membrane of the chloroplast 
envelope, to be released during the breakage of chloroplasts by 
osmotic shock and one cycle of freeze-thawing. To verify this 
hypothesis, we used a mild proteolytic digestion of intact 
chloroplasts with thermolysin because this non-penetrating pro- 
teolytic enzyme has been demonstrated to be an efficient tool for 
characterizing those envelope proteins that are accessible from 
the cytosolic side of the outer membrane [24]. The elution pattern 
of the extract obtained from thermolysin-treated chloroplasts 
confirms that HCS activity was indeed a genuine constituent of 
chloroplasts because peaks A and C were not affected by the 
treatment (Figure 1E). In contrast, peak B was no longer
detectable in thermolysin-treated chloroplasts, thus providing 
clear evidence for an extra-plastidial localization of this HCS 
activity. Therefore, from the results presented in Figures 1(B), 
1(D) and 1(E), and in Tables 1 and 2, we can conclude that peak 
B does indeed correspond to HCS present in the cytosol. As a 
control we used the same proteolysis treatment to confirm that 
HCS activity associated with the mitochondria was clearly inside 
the organelle. The identical elution profiles before and after 
thermolysin treatment confirmed the high latency of the activity 
associated with the mitochondria. 
Figure 2 Nucleotide and predicted amino acid sequences of the cDNA
encoding HCS from A. thaliana (HCS-2 clone)

The coding sequence is indicated with capital letters; the non-coding sequence is indicated with
lower-case letters. Nucleotides are numbered at the right. The first in-frame ATG is in bold and
the corresponding stop codon is marked with an asterisk. The longest open reading frame
extends for 1101 bp and translates into a 367-residue protein with a molecular mass of
41 172 Da.



Isolation of cDNA clones encoding HCS by complementation of
temperature-sensitive BM4050 cells

To characterize the HCS isoforms further at the molecular level,
we developed a functional complementation screening technique,
using a higher plant cDNA expression library, and an E. coli
mutant affected in HCS activity. The E. coli birA215 competent
cells lacking endogenous biotin ligase activity when grown at
43 °C were transformed by electroporation with approx. 6 x 10*
plasmids expressing an A. thaliana cDNA library (initially
containing 10⁷ independent recombinants [30]). Isolation of E.
coli birA215-complemented clones was attempted by selection on
M9 plates containing 0.2% glucose, 2 nM biotin, 0.4% casein
hydrolysate, 100 µg/ml TTC, 1 µg/ml thiamin hydrochloride
and 100 µg/ml carbenicillin at 43 °C. After 48 h of growth at
43 °C, six colonies were isolated. When these clones were cultured
at 43 °C in M9 liquid medium supplemented as above, four of
them retained the ability to complement the biotin auxotrophy
and temperature-sensitive growth of the birA215 host, and were
referred to as HCS-1, HCS-2, HCS-3 and HCS-4.

Characterization, nucleotide sequences and deduced amino acid
sequences of A. thaliana HCS cDNA



The length of the cDNA inserts was found to be approx. 1.4 kb
in HCS-1 and HCS-4, and 1.5 kb in HCS-2 and HCS-3. The four 
cDNA inserts shared the same internal sequence, but differed 
only in the length of their untranslated 5' and 3' ends. Thus HCS- 
2 and HCS-3 were found to be identical, whereas HCS-1 carried 
a 33 bp extension on the 5' end and HCS-4 had a 23 bp deletion 
on its 5' end compared with HCS-2 and HCS-3. Finally, all four 
clones contained a poly(A) tail, although none of them exhibited 
a typical eukaryotic polyadenylation signal sequence [33]. The 
position of the poly(A) tail with respect to the TGA stop codon 
varied depending on the clone. Thus the distance between the 
poly(A) tail and the TGA stop codon was 315 bp for clones
HCS-2 and HCS-3, and 252 bp for clones HCS-1 and HCS-4. 

The complete nucleotide sequence of the longest isolated 
cDNA (HCS-2) is shown in Figure 2 along with the deduced 
Figure 3 HCS activity in crude protein extracts from E. coli birA215 strain
transformed or not with the four HCS cDNA clones and from E. coli birA+
wild-type strain

Total soluble proteins from the different strains were extracted after induction of the Lac
promoter with isopropyl β-D-thiogalactoside, for an optimal time of 4 h, as described in the
Materials and methods section. HCS activity was determined as described previously by
measuring the specific incorporation of D-[³H]biotin into bacterial apo-BCCP [17]. D-Biotin
(1 pmol) corresponds to 95000 d.p.m.



amino acid sequence. HCS-2 is a 1534 bp cDNA including one 
large open reading frame of 1101 bp. The first in-frame ATG 
occurring at nucleotide 50 of the cDNA initiates the longest open 
reading frame present on this cDNA and encodes a predicted 
polypeptide of 367 residues with a molecular mass of 41 172 Da. 
A second in-frame ATG is present 114 bp downstream from the
first one. The nucleotide sequence around the first ATG codon, 
TTTAATGGA (positions 46-54), differs from the plant con- 
sensus translation initiation motif, AACAAUGGC [34]. In 
contrast, that for the second, AGCAATGGA (positions
160-168), more closely matches the plant consensus sequence. 
Finally, the presence of an in-frame nonsense codon TGA, 
located 27 bp upstream from the first ATG, confirms that HCS- 
2 is full length. 
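
The comparison of the two initiation contexts with the plant consensus can be made explicit by counting matching positions. The sketch below is illustrative only: the helper name is hypothetical, the consensus is written with T in place of U, and simple position-by-position matching is an assumption about how the similarity was judged.

```python
# Count position-wise matches between each ATG context and the plant consensus motif.
CONSENSUS = "AACAATGGC"  # AACAAUGGC from the text, written as DNA

contexts = {
    "first ATG (positions 46-54)": "TTTAATGGA",
    "second ATG (positions 160-168)": "AGCAATGGA",
}

def count_matches(motif, consensus=CONSENSUS):
    """Number of positions at which motif and consensus carry the same base."""
    if len(motif) != len(consensus):
        raise ValueError("motif and consensus must have the same length")
    return sum(a == b for a, b in zip(motif, consensus))

scores = {label: count_matches(seq) for label, seq in contexts.items()}

# The A of each ATG is the fifth base of its 9-nt window.
first_atg = 46 + 4
second_atg = 160 + 4
offset = second_atg - first_atg

print(scores, offset)
```

With these inputs the second context matches the consensus at more positions than the first, and the two ATG codons are 114 nt apart, as stated in the text.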

Functional characterization of HCS cDNA 

To confirm that the four cDNA species encoded HCS, we 
measured this enzyme's activity in crude protein extracts obtained 
from the E. coli birA215 mutant complemented with each of the 
four clones, by using bacterial apo-BCCP as the biotin acceptor 
substrate (Figure 3). All four clones had the same orientation in 
PX E § vector and were under the control of the lac promoter.
Whereas no HCS activity could be detected with the birA215 
mutant, the four complemented clones exhibited significant HCS 
activity (Figure 3). Thus levels of HCS activity from the 
recombinant clones were 13-14-fold higher than that from the 
wild-type birA + (BM2661) strain (Figure 3). 

Transcription-translation of pBluescript HCS-2 in vitro

Because the complete nucleotide sequence contains two in-frame 
ATG sequences, it is possible that isolated HCS cDNA species 
encode two distinct polypeptides. Coupled transcription- 
translation of HCS-2 cDNA in vitro subcloned in pBluescript 
under the control of the T7 promoter, using T7 RNA polymerase 
and a rabbit reticulocyte lysate, produced two major translation 
products of approx. 37 and 41 kDa, i.e. of the expected sizes for 
an initiation of translation at the two in-frame ATG sequences 
Figure 4 Coupled transcription-translation of pBluescript HCS-2 in vitro

Polypeptides radioactively labelled with [³⁵S]methionine were subjected to SDS/PAGE and
analysed by fluorography as described in the Materials and methods section. Lane 1,
fluorography of radiolabeled polypeptides obtained from a control reaction lacking DNA
template; lane 2, fluorography of translation products obtained with 2 µg of pBluescript HCS-
2. The positions of molecular mass markers (given in kDa) are indicated on the left.



(Figure 4, lane 2). The additional labelled polypeptide of mol- 
ecular mass 52 kDa did not correspond to a translation product 
of the HCS-2 cDNA because it was also detected in the control 
experiment lacking the DNA template (Figure 4, lane 1). It seems 
unlikely that the production of the two polypeptides can be 
explained by a premature termination of translation because a 
similar result was obtained with wheat germ extract as the 
translation system in vitro (results not shown). 
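
The sizes expected for initiation at the two in-frame ATG codons can be estimated from figures given in the text. This is a rough sketch rather than the authors' calculation: it assumes that the average residue mass of the full-length protein also applies to the shorter product, and the variable names are illustrative.

```python
# Estimate the mass of the product initiated at the second in-frame ATG.
ORF_LENGTH_BP = 1101         # longest open reading frame
FULL_LENGTH_MASS_DA = 41172  # predicted mass of the 367-residue polypeptide
SECOND_ATG_OFFSET_BP = 114   # distance between the two in-frame ATG codons

residues_full = ORF_LENGTH_BP // 3
# Codons skipped when translation starts at the second ATG.
skipped = SECOND_ATG_OFFSET_BP // 3
second_met_position = skipped + 1
residues_short = residues_full - skipped

# Assumption: uniform average residue mass across the protein.
mean_residue_mass = FULL_LENGTH_MASS_DA / residues_full
short_mass_kda = residues_short * mean_residue_mass / 1000

print(residues_full, second_met_position, residues_short, round(short_mass_kda, 1))
```

On this estimate the shorter product is close to 37 kDa, which agrees with the smaller of the two translation products observed.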

Southern blot analysis 

Southern analysis was used to examine the number of genes 
encoding HCS in A. thaliana. Total DNA was digested with 
restriction enzymes that cut once (EcoRV) or do not cut (EcoRI,
HindIII) within the cDNA, and the fragments were resolved by
agarose gel electrophoresis. After transfer to a Hybond-N+
membrane, the resultant blot was probed with the ³²P-labelled
complete HCS-2 insert as described in the Materials and methods
section. As shown in Figure 5, digestion with EcoRI or HindIII
produced two bands (7. 1-4.6 and £-3.4 kb), whereas digestion
with EcoRV resulted in three hybridization bands of 13, 12
(arrowheads) and 1.7 kb. After double digestion of total DNA
with EcoRV and HindIII, the probe detected three bands, a
1.7 kb band generated by EcoRV, a 3.4 kb band generated by
HindIII and a 6.2 kb band. Hybridizations were also performed
with a restriction fragment corresponding to the 501 bp 5' end
(EcoRI-EcoRV fragment) of the pBluescript HCS-2 cDNA.
After digestion with EcoRI or HindIII a similar pattern of
hybridization was obtained, whereas after digestion with EcoRV
or double digestion with EcoRV and HindIII the 5'-end probe
revealed only two bands in both cases, that is the 13 and 1.7 kb
bands and the 3.4 and 1.7 kb bands respectively (results not
shown). Altogether, these results are consistent with the existence 
of two related genes encoding HCS in A. thaliana. However, such 
a banding pattern could also be generated by a single gene 
containing large introns. This last possibility cannot be com- 
pletely ruled out. 
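
The reasoning behind the two-gene interpretation can be illustrated with a deliberately naive band-count model. It is a sketch under stated assumptions, not the authors' analysis: each gene copy is assumed to be intron-free within the probed region and to carry the same sites as the cDNA, so a probe spanning n cut sites yields at most n + 1 hybridizing fragments per copy. The function name is hypothetical.

```python
# Naive upper bound on hybridizing bands in a genomic Southern blot.

def max_bands(gene_copies, cuts_within_probe):
    """Upper bound on bands: each copy is split into cuts + 1 probe-positive fragments."""
    if gene_copies < 1 or cuts_within_probe < 0:
        raise ValueError("need at least one gene copy and a non-negative cut count")
    return gene_copies * (cuts_within_probe + 1)

# Cut counts within the cDNA as given in the text.
cuts = {"EcoRI": 0, "HindIII": 0, "EcoRV": 1}

for copies in (1, 2):
    predicted = {enzyme: max_bands(copies, n) for enzyme, n in cuts.items()}
    print(copies, predicted)
```

Under this model a single intron-free gene would give one band with EcoRI or HindIII, whereas two were observed, which is why two gene copies, or one gene interrupted by introns containing these sites, are the candidate explanations. Fewer bands than the upper bound, as with EcoRV, could arise if fragments co-migrate or if the copies differ in their sites.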

DISCUSSION 

Fractionation of pea leaf protoplasts and purification of chloro- 
plasts and mitochondria from this tissue clearly indicate that 
HCS activity is associated with several subcellular compartments. 
Figure 5 Southern blot hybridization of total A. thaliana DNA with the
complete HCS cDNA as a probe

Total leaf DNA from A. thaliana was digested (2-5 µg per reaction) with EcoRI (lane 2), HindIII
(lane 3), EcoRV (lane 4) and HindIII-EcoRV (lane 5). Lane 1 corresponds to size markers
(λ DNA HindIII digest). DNA restriction fragments were separated on a 0.8% agarose gel,
transferred to Hybond-N+ membrane and then hybridized with the ³²P-labelled probe
corresponding to the pBluescript HCS-2 cDNA excised with EcoRI.



HCS was resolved into two peaks by anion-exchange chromato- 
graphy. The main peak (making up approx. 90% of the total 
activity) was located in the cytosol. The other peak was present 
in both chloroplasts and mitochondria. The great purity and the 
higher latency values of HCS activity measured in Percoll- 
purified chloroplasts and mitochondria, together with the pro- 
tection of the enzyme activity in these organelles during thermo- 
lysin treatment, confirmed that HCS is a genuine constituent of 
chloroplasts and mitochondria. 

Detailed characterization of the incorporation of D-biotin into 
plant biotin-dependent carboxylases requires large quantities of 
HCS isoenzymes. To overcome the very low abundance of these 
enzymes in plants, as judged by the low specific activities detected 
in each cell compartment, we attempted to obtain the cDNA 
species encoding these isoforms in order to over-express it. The 
biotinylation reaction is evolutionarily conserved such that biotin 
ligases function with various apocarboxylases across species 
boundaries [2]. We have therefore used the cross-species reactivity 
of biotin ligases to clone a cDNA encoding a complete A. 
thaliana HCS by functional complementation of a mutant birA
strain of E. coli. Selection of plant cDNA species encoding HCS
by this technique confirms our previous results showing that pea
HCS efficiently biotinylates bacterial apo-BCCP in vitro [17]. 

Comparison of the predicted protein sequence of the isolated 
plant HCS cDNA with those compiled in the GenBank and 
EMBL databases showed low but significant similarity to biotin : 
apoprotein ligases from E. coli (17% identity; 33% similarity) 
[4], B. subtilis (18 % identity; 31 % similarity) [8], P. denitrificans 
(14% identity; 25% similarity) [7], Homo sapiens (24% identity;
40% similarity) [13,14] and S. cerevisiae (22% identity; 40% 
similarity) [12] (Figure 6). Specific areas of similarity are restricted 
to a region known in E. coli BirA to contain the biotin-binding 
site [6]. Thus, across a restricted 125-residue region (positions 
122-246 of A. thaliana HCS), the plant enzyme shares 33% 
identity (51% similarity) with BirA protein from E. coli, 
33 % identity (47 % similarity) with a putative homologue of BirA 
protein from B. subtilis, 30% identity (47% similarity) with the 



candidate for BirA protein from P. denitrificans, 30% identity 
(49% similarity) with HCS from H. sapiens and 34% identity 
(51 % similarity) with HCS from S. cerevisiae. Most importantly, 
the eight residues involved in direct contact with biotin in BirA 
from E. coli, as determined by X-ray crystallography [6], are 
strictly conserved in A. thaliana HCS. These residues are Ser 1 ", 
Thr 1 " Gin" 5 , Arg 1 " Lys* 20 , Gly tt3 , GIy"° and Gly" 2 . Interest- 
ingly, similarity to chicken avidin, the protein with the highest 
known affinity for biotin [35] was also observed, particularly in 
the region located between Asp* 05 and Thr 129 of the predicted 
amino acid sequence (Figure 6). The sequence identity in this 
zone was 36%, and increased to 60% when conservative 
substitutions were included. Altogether these results confirm the 
finding that this region is essential for biotin binding. Within this 
region, the sequence GRGRTK is present at positions 148-153 
and partly matches that found in other biotin ligases (GRGRRG) 
(Figure 6). Although no crystallographic evidence for the ATP- 
binding site has been demonstrated [6], the structure GXGXXG 
has been associated with ATP binding in several enzymes [36,37]. 
However, it also seems to be involved in contact with biotin in 
BirA [6]. Presumably this reflects a requirement for ATP and 
biotin to be spatially close to permit the formation of biotinyl 5'- 
adenylate, an intermediate of the biotinylation reaction. Sequence 
comparisons within the helix-turn-helix DNA binding motif of 
E. coli and B. subtilis BirA proteins (residues 22-46 and 23-47
respectively) [4,8] revealed no region of similarity, suggesting 
that in contrast with what occurs in bacteria, plant HCS is not 
involved in the repression of biotin synthesis. This observation is 
in good agreement with previous findings showing that the free 
D-biotin concentration in the cytosol of plant cells is of the order 
of 11 [17,18]. Indeed, this level, which is approx. 2000-fold 
that found in bacteria [38], was not compatible with the existence 
of a strong repressor of biotin synthesis in plants, comparable 
with the repressor function of BirA protein, which regulates the 
level of biotin in bacteria. Finally, there are sequences conserved 
between A. thaliana HCS and the two other eukaryotic proteins 
that are not found in BirA from bacteria (Figure 6). For example, 
in the C-terminal portion of the predicted A. thaliana HCS 
sequence (residues 301-367), the plant enzyme shares only 12 % 
identity (19% similarity) with BirA from B. subtilis compared 
with 37 % identity (58 % similarity) with HCS from S. cerevisiae. 
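
Percentage identity and similarity figures of the kind quoted above can be computed from an aligned pair of sequences once conservative substitutions are defined. The sketch below uses the grouping given in the Figure 6 legend (I-L-M-V, N-Q, R-K-H, A-S-P-T-G, Y-F-W, D-E) and applies it to the two six-residue motifs mentioned in the text. The function names are hypothetical, the handling of gaps is an assumption, and similarity is taken here to include identical positions.

```python
import re

# Conservative substitution groups from the Figure 6 legend.
GROUPS = ["ILMV", "NQ", "RKH", "ASPTG", "YFW", "DE"]

def is_conservative(a, b):
    """True if two different residues fall in the same substitution group."""
    return a != b and any(a in g and b in g for g in GROUPS)

def identity_similarity(seq1, seq2):
    """Return (% identity, % similarity) over aligned, gap-free columns."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    columns = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    identical = sum(a == b for a, b in columns)
    similar = identical + sum(is_conservative(a, b) for a, b in columns)
    return 100 * identical / len(columns), 100 * similar / len(columns)

plant_motif = "GRGRTK"   # A. thaliana HCS, positions 148-153
ligase_motif = "GRGRRG"  # motif found in other biotin ligases

identity, similarity = identity_similarity(plant_motif, ligase_motif)
# GXGXXG pattern associated with ATP binding in several enzymes.
fits_gxgxxg = {m: bool(re.fullmatch("G.G..G", m)) for m in (plant_motif, ligase_motif)}
print(round(identity, 1), round(similarity, 1), fits_gxgxxg)
```

The plant motif shares four of six positions with GRGRRG and does not itself fit the strict GXGXXG pattern, which is what "partly matches" amounts to in this simple scoring.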

Comparison of the N-terminal portion of known HCS 
sequences indicates that the A. thaliana protein contains an N- 
terminal extension of approx. 30 amino acids compared with 
those from bacteria (Figure 6). This N-terminal extension is rich 
in hydrophobic, hydroxylated and positively charged amino 
acids, but poor in acidic residues. In addition, theoretical 
secondary-structure predictions indicate that this sequence 
would fold into an amphiphilic α-helix. Thus this extension has
many characteristics of mitochondrial and chloro-
plast presequences. Furthermore, analysis of the primary se-
quence of the plant HCS sequence by the subroutine TRANSEP
from the program PCGENE (IntelliGenetics) predicted that this 
protein might be targeted to an organelle, possibly the mito- 
chondrion, and identified the sequence Arg* 9 -Leu/Ser-Phe as a 
putative cleavage-site motif [39]. However, the occurrence of an 
in-frame Met residue at position 39, which might act as a 
potential translation initiation site, as indicated by transcription- 
translation experiments in vitro (Figure 4), might indicate that 
the present clone encodes, in reality, a cytosolic HCS isoform. 
Therefore it is not possible from our data to determine the exact 
subcellular localization of the cloned A. thaliana HCS. Further 
studies on the uptake of the cloned plant HCS by mitochondria 
and/or chloroplasts, as well as in situ localization experiments on 
plant cells overexpressing this clone are in progress to make a 
Figure 6 Amino acid sequence comparison between biotin ligases and
chicken avidin

Sequences are shown for A. thaliana HCS, E. coli BirA protein [4], B. subtilis BirA protein [8],
P. denitrificans putative biotin ligase [7], H. sapiens HCS [13,14], S. cerevisiae HCS [12] and
chicken avidin [35]. The sequences are aligned with gaps (-) to maximize identity. Conserved
amino acids represented in bold are reported in the deduced consensus line (Consensus).
Conservative amino acid substitutions were determined in accordance with the following
grouping: I-L-M-V, N-Q, R-K-H, A-S-P-T-G, Y-F-W and D-E. The amino acid positions for
H. sapiens and S. cerevisiae HCS and chicken avidin are shown at the right.



definitive assignment of the cellular localization of this clone to
a specific compartment. It is interesting to note a recent proposal
that human cytosolic and mitochondrial HCS isoforms are 
synthesized from a single species of mRNA either by alternative 
translational initiation at two in-frame AUGs or by a splicing 
mechanism [14]. 

Finally, the presence of HCS isoforms in different cell com- 
partments, as determined biochemically in pea leaves, raises the 
question of their physiological significance in plants. In mammals, 
two isoforms of the same HCS targeted to cytosol and mito- 
chondria respectively have been characterized, catalysing the 
biotinylation of one cytosolic biotin-dependent carboxylase 
(ACCase) for the former and three mitochondrial biotin- 
dependent carboxylases (3-methylcrotonoyl-CoA carboxylase, 
propionyl-CoA carboxylase and pyruvate carboxylase) for the 
latter [9-11,14]. In plant cells, we and others have characterized
different biotin-dependent carboxylases localized in three cell 



compartments, namely a mitochondrial 3-methylcrotonoyl-CoA 
carboxylase, a chloroplastic ACCase (of eukaryotic type in 
Gramineae and of prokaryotic type in other plants) and a 
putative cytosolic ACCase [40-45]. Thus the existence of HCS 
isoforms in different cell compartments suggests that the different 
biotin-dependent carboxylases in plants are biotinylated in the 
cell compartment within which they are localized. In a previous 
publication we showed that pea leaf cells contain a pool of free 
D-biotin localized in the cytosol, and no detectable levels of this 
vitamin in the organelles [18]. This therefore raises the question 
of how chloroplast and mitochondrial HCS forms might function 
in vivo. More recently we demonstrated that plant HCS displayed 
a very low Km for D-biotin of 28 nM [17]. As the detection limit
for the biological assay used to determine free D-biotin levels in 
pea leaf cell compartments was of the order of 0.05 ng [18], it is 
possible that chloroplasts and mitochondria actually contain 
very low D-biotin concentrations that are sufficient to allow the 
biotinylation of biotin-dependent carboxylases present in these 
organelles by HCS isoenzymes. Further characterization of these 
enzymes, and particularly the identification of a possible struc- 
turally distinct HCS isoform, will be necessary for understanding 
the mechanism of biotinylation of apocarboxylases in plants and 
to elucidate the question of why HCS activity is com- 
partmentalized in plant cells. 
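
To give a sense of scale for this argument, the detection limit can be converted into moles and compared with the Km. The sketch below is an illustration rather than a calculation from the paper: the molar mass of D-biotin (244.31 g/mol) is a standard value not stated in the text, and expressing the limit as a sample volume at Km is an assumption introduced here.

```python
# Compare the bioassay detection limit for biotin with the Km of plant HCS.
BIOTIN_MOLAR_MASS = 244.31   # g/mol, standard value (not given in the text)
DETECTION_LIMIT_NG = 0.05    # ng, from the text
KM_NM = 28                   # nM, from the text

# Detection limit expressed in pmol (ng -> g -> mol -> pmol).
limit_pmol = DETECTION_LIMIT_NG * 1e-9 / BIOTIN_MOLAR_MASS * 1e12

# Volume (in microlitres) of a 28 nM solution that contains exactly that amount.
volume_ul = (limit_pmol * 1e-12) / (KM_NM * 1e-9) * 1e6

print(f"{limit_pmol:.2f} pmol; equivalent to {volume_ul:.1f} µl at Km")
```

On these assumptions, an assayed organelle sample holding biotin at a concentration near Km would fall below the detection limit unless its aqueous volume exceeded a few microlitres, which is consistent with the suggestion that low organellar biotin pools could have gone undetected.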

We thank Dr. A. M. Campbell (Stanford University) for supplying birA mutants; and 
Dr. D. Job, Dr. S. Ravanel, Dr. P. Baldet and Dr. R. Derose for helpful discussions 
and critical reading of the manuscript. This study was conducted under the BioAvenir 
program financed by Rhône-Poulenc with the contribution of the Ministère de la
Recherche et de l'Espace and the Ministère de l'Industrie et du Commerce Extérieur.
Role of avidin and other biotin-binding proteins in the deposition
and distribution of biotin in chicken eggs

Discovery of a new biotin-binding protein 
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anothe SJ (WT> m * * t aracten f ed blotin " bindi ^ P»tcin (BBP-I), we have discovered 

IstebL S Is^r r I f P Y n k fr ° m !aying hens - BBPI is stable to 65 ° C ' whe ™* BBP-II 

is stable to 45 C. Both proteins are normally saturated with biotin and together they account for most, if 

not all of the biotin in hen plasma and yolk, except in hens fed excessive amounts of biotin (> 1 mg of 

biotin/kg ; of feed) ^The maximal production of BBP-I is attained at lower levels of dietary biotin 

<~ S0/.g/kg) tium for ^ BBP-II (~ 250^g/kg); however, the maximal production of BBP-II is severalfold 

greater than tor BBP-I. Consequently, as dietary biotin increases, the ratio of BBP-II to BBP-I increases 

and becomes constant at dietary intakes of biotin above 250^/kg. The observation that the amounts of 

these proteins are limited by biotin in the normal dietary range ( < 250 /.g/kg) suggests that biotin is required 

for the synthesis, secretion or stability of these proteins. Although both plasma vitamin-protein complexes 

are transported to the oocyte and concentrated in the yolk, BBP-II is transferred more efficiently. Thus biotin 

deposition in the yolk is a function of the amounts and relative concentrations of the two proteins. Dietary 

biotin above 250/ig/kg exceeds the transport capacity of BBP-I and BBP-II in the plasma- however 

unbound biotin does not accumulate. Rather it is efficiently scavenged by avidin in the oviduct and 

trans erred to the egg albumen. Only when avidin becomes saturated at high dietary intake does free or 

weakly bound biotin accumulate in plasma and yolk. The synthesis of avidin is independent of dietary biotin 

Small amounts of BBPs with the heat-stability of avidin or BBP-I respectively are present in the plasma of 

adult males or immature chickens. BBP-II, the major BBP in the plasma and yolk of laying hens, was not 

detected in the plasma of non-laying chickens. 



INTRODUCTION 

In 1927 Boas showed that there was a nutritional 
factor in egg yolk and other foods that was inactivated 
by a heat-labile component of egg albumen. Biotin, the 
nutritional factor, was first isolated and characterized 
from duck egg yolk (Kögl & Tönnis, 1936). Avidin, the
antivitamin, was purified from egg albumen (Pennington
et al., 1942) and shown to be an extraordinarily stable
tetrameric protein whose affinity for biotin may well be
the strongest non-covalent interaction between a protein
and a small molecule (Green, 1975). György & Rose
(1942) observed that biotin in egg yolk became
dialysable on heating, but it was more than 30 years
before a biotin-binding protein (BBP) was identified and
purified from egg yolk (White et al., 1976; Meslar et al.,
1978; Murty & Adiga, 1984). Though similar in size and
quaternary structure, BBP is distinct from avidin by a
number of criteria. This protein is also present in the
plasma of laying hens (Mandella et al., 1978) and is
presumed to be deposited in the ovarian follicle along
with other egg-yolk proteins (White, 1985).

The original assay for yolk BBP was based on the
equilibrium exchange of endogenous bound biotin and
exogenous [¹⁴C]biotin at 65 °C (Meslar & White, 1979).
Although this assay is quite satisfactory for partially



purified BBP, it was technically difficult to perform on 
yolk extracts, and the limits of detection were approached 
with plasma samples. Furthermore, the calculations for 
this assay included corrections for isotope dilution which 
assumed that the protein was initially saturated with 
unlabelled biotin and that no additional endogenous 
biotin was present. Recently a new and much more 
sensitive assay based on the exchange binding of 
[³H]biotin has been developed (White & McGahan,
1986; H. B. White, T. McGahan & M. A. Letavic,
unpublished work). The graphical analysis of this assay
yields the amount of endogenous ligand (Lotter et al.,
1982). Although there are still some technical problems
with this assay that cause underestimates of the amount
of binding proteins, it is now possible to analyse BBP in
samples not accessible with the previous assay.
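
The isotope-dilution correction mentioned for the original exchange assay can be illustrated with a simple model. This is a sketch of the general principle only, not the published calculation: it assumes, as the text states, that the protein is initially saturated with unlabelled biotin and that no other endogenous biotin is present, and additionally that exchange reaches full equilibrium. All names and numbers are hypothetical.

```python
# Isotope-dilution estimate of binding sites from an equilibrium exchange assay.

def binding_sites(bound_dpm, added_pmol, specific_activity):
    """Binding sites (pmol) given bound label, added labelled biotin and its dpm/pmol.

    Model: B sites start saturated with unlabelled biotin, so after exchange the
    pooled ligand (added + B) has specific activity s0 * added / (added + B) and
    the bound radioactivity is D = s0 * added * B / (added + B).
    """
    total_dpm = specific_activity * added_pmol
    if not 0 <= bound_dpm < total_dpm:
        raise ValueError("bound radioactivity must be below the total added")
    return bound_dpm * added_pmol / (total_dpm - bound_dpm)

# Round-trip check with made-up numbers: 25 pmol of sites, 100 pmol of label.
true_sites, added, s0 = 25.0, 100.0, 1000.0
simulated_bound = s0 * added * true_sites / (added + true_sites)
estimate = binding_sites(simulated_bound, added, s0)
print(estimate)
```

Within this model, any additional unlabelled biotin in the sample dilutes the label further than the formula allows for, so the number of binding sites is underestimated; this is the limitation of the original assay that the text points out.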

In the process of applying this new assay to yolk and
plasma samples from laying hens that had been fed diets
differing in their biotin content, we discovered that the
endogenous biotin content of yolk and plasma
exceeded the BBP binding sites by severalfold. This was
unexpected, because this excess biotin should have been
dialysable. The paradox was resolved by the discovery of
a second BBP (BBP-II) that binds most, if not all, of the
biotin not bound by the previously recognized BBP,
hereafter designated 'BBP-I'.



Abbreviation used: BBP, biotin-binding protein. 
Address correspondence and reprint requests to: Department of Chemistry, University of Delaware, Newark, DE 19716, U.S.A.
Table 1. Composition of experimental diets



H. B. White III and C. C. Whitehead 



Vitamin/mineral supplements provided (per kg of diet): A: copper, 3.6 mg; iodide, 0.4 mg; iron, 80 mg; magnesium, 100 mg;
manganese, 100 mg; zinc, 50 mg; retinol, 600 µg; cholecalciferol, 15 µg; α-tocopherol, 1 T^LnadiSw lTm E riboflavin,
4 mg; nicotinic acid, 28 mg; pantothenic acid, 10 mg; B: as for A, plus cyanocobalamin, 25 add 0 5 ^mg g Vy^d 0 x^^e ,
4 mg; thiamin, 2 mg; C: as for A, plus biotin, 100 µg; choline chloride, 0.5 g. g ' pynaoxine >



Ingredient: Maize; Wheat; Starch; Herring meal; Meat and bone meal; Soybean meal; Casein (low-vitamin); Gelatin;
Egg albumen (spray-dried); Isolated soy protein; Vegetable oil; Cellulose; Limestone flour; Dicalcium phosphate;
Salt; DL-Methionine; Vitamin/mineral supplements A, B, C

Composition (g/kg)

Laying diet: 750, 60, 30, 25, 20, 65, 22, 3
Biotin-deficient laying diet: 553, 80, 100, 48, 36, 30, 36, 76, 30, 4, 2
Chick diet: 50, 596, 30, 145, 105, 40, 13, 12, 3

In the present paper we document the presence of 
BBP-II and show that it is unstable under the conditions 
used for the assay of BBP-I. Furthermore we show that 
neither the deposition of biotin in yolk nor the 
distribution of biotin between yolk and albumen is a 
simple function of dietary biotin or plasma biotin 
concentration. The patterns can be explained by the 
differential synthesis and differential transport of BBP-I 
and BBP-II. 



MATERIALS AND METHODS 

Experimental design 

ISA Brown hens (96 in all, housed in individual
battery cages) were maintained on a standard laying diet
for several weeks until a high rate of egg production was
established. The diet, the composition of which is given
in Table 1, was based on wheat and contained a
relatively low amount of available biotin (28 µg/kg),
but was nevertheless thought to be adequate in all
nutrients for maximal egg production (Whitehead, 1980).
Seven groups of 12 hens each were fed the standard diet
supplemented with 0, 100, 250, 500, 1000, 2000 and
4000 µg of biotin/kg of feed. An eighth group was fed a
biotin-deficient diet (Table 1). This diet contained
hen's-egg albumen as a source of avidin and was thought
to be virtually devoid of available biotin. Daily egg
production and weekly feed consumption were recorded.
Plasma, yolk and albumen were collected weekly and
analysed for biotin and biotin-binding proteins.

In a second experiment, newly hatched chicks were
sexed and then fed a diet (Table 1) containing about
160 µg of available biotin/kg. Blood samples were taken
from several chicks and pooled at various ages and
analysed for BBPs.

Sample preparation 

Plasma, yolk and albumen samples were obtained
exactly as described by White et al. (1986) for similar
studies on riboflavin, except that the 4-fold dilutions of
egg yolk were made with 50 mM-sodium acetate, pH 5.5,
containing 50 mM-NaCl. Samples were stored frozen at
-20 °C until assayed for BBP. On the day of analysis,
plasma samples were usually diluted 10-fold, yolk
samples an additional 10-50-fold and albumen samples
200-fold with the above buffer. Samples for biotin
analysis were not diluted before freezing.

Assays for BBP-I 

BBP-I was assayed by a procedure
analogous to that described for assaying
riboflavin-binding protein (Lotter et al., 1982). A series
of tubes containing 0.1 µCi of D-[8,9-³H(n)]biotin (lot
2169-146, 35.0 Ci/mmol; New England Nuclear Corp.,
Boston, MA, U.S.A.) and 20-240 µl of diluted plasma or
yolk in the above sodium acetate buffer (total volume
1.0 ml) was incubated at 65 °C for 40 min to equilibrate
free and BBP-I-bound biotin. These conditions denature
BBP-II. The cooled incubation mixtures were quantitatively
transferred to small phosphocellulose columns
[0.25 ml bed volume in polypropylene pipette tips
(Sarstedt, no. 91-787)]. Uncomplexed biotin was eluted
with two 1.0 ml buffer washes. BBP-I-bound biotin was
then eluted directly into scintillation vials by washing the
columns with 4 x 0.25 ml of the sodium acetate buffer
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Egg biotin-binding proteins 

containing 2 M-NaCl. A portion (10 ml) of scintillation-
counting fluid (OptiPhase 'X', Amersham International)
was added and the amount of radioactivity determined
in a liquid-scintillation counter. Non-specific binding was
determined in the presence of a 500-fold excess of
unlabelled biotin, and an avidin solution was used to
determine total bindable radioactivity. Data were
plotted according to the following linear equation (White
& McGahan, 1986) (the slope and intercepts were
determined by linear-regression analysis):
$$\frac{{}^{*}L_T}{{}^{*}L_B} = \frac{L_T}{P_T} + \frac{1}{[P_T]}\cdot\frac{{}^{*}L_T\,F}{V}$$

If ${}^{*}L_T/{}^{*}L_B$, the ratio of total to bound radioactive
ligand, is plotted as a function of ${}^{*}L_T F/V$, where $F$ and
$V$ are the dilution factor and volume of the diluted
sample used in the assay, the y-intercept, $L_T/P_T$, is the
ratio of endogenous biotin to biotin-binding sites and
the reciprocal of the slope, $[P_T]$, is the concentration of
binding protein in the undiluted sample.
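The graphical analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes the straight-line model implied by the text (y-intercept equal to the ratio of endogenous biotin to binding sites, reciprocal slope equal to the binding-site concentration), and the function name and the data points are hypothetical.

```python
# Sketch of the straight-line analysis of a radioligand exchange assay.
# Assumed model (taken from the description in the text):
#   y = L_T/P_T + x / [P_T]
# where x = *L_T * F / V and y = *L_T / *L_B.

def fit_exchange_assay(x, y):
    """Ordinary least-squares line through (x, y).

    Returns (endogenous ligand / binding sites, binding-site concentration),
    i.e. (y-intercept, 1 / slope).
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, 1.0 / slope

# Hypothetical, noise-free points generated with L_T/P_T = 1.5 and [P_T] = 50
x_values = [10.0, 20.0, 40.0, 80.0]
y_values = [1.5 + xi / 50.0 for xi in x_values]

ratio, binding_sites = fit_exchange_assay(x_values, y_values)
print(ratio, binding_sites)
```

With real assay data the points scatter about the line, and the uncertainty in the intercept grows as endogenous biotin increases, which is in keeping with the authors' remark that the BBP-I assay is less accurate at high biotin levels.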

Assays for BBP-II 

An assay that directly measures BBP-II in the presence
of BBP-I has not been developed. Two procedures have
been used to estimate BBP-II activity. The first method
was exactly like that described for BBP-I except that the
incubation temperature was 45 °C instead of 65 °C. The
choice of temperatures was based on the thermal-stability
profiles presented in Figs. 5 and 6 (below). The difference
between the results from the 45 °C and 65 °C assays was
attributed to BBP-II, but this approximation is in error
to the extent that BBP-I does not achieve equilibrium
exchange at 45 °C. Alternatively, BBP-II activity has been
estimated from the BBP-I assays by assuming that BBP-II
is saturated with biotin and that all endogenous biotin in
excess of BBP-I is bound to BBP-II. Although both of
these assays are qualitatively reliable, the values
obtained are systematically low compared with the
values expected from bioassay for biotin.
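The two estimates of BBP-II described above amount to simple subtractions. The sketch below only illustrates that arithmetic; the function names and the example numbers are hypothetical, and both estimates inherit the assumptions stated in the text (complete exchange of BBP-I at 45 °C for the first, saturation of both proteins for the second).

```python
# Two rough estimates of BBP-II, following the reasoning in the text.
# All quantities are concentrations of biotin-binding sites (or of biotin)
# expressed in the same units.

def bbp2_by_difference(sites_at_45c, sites_at_65c):
    """Sites measured at 45 degC (BBP-I + BBP-II) minus sites at 65 degC (BBP-I)."""
    return max(sites_at_45c - sites_at_65c, 0.0)

def bbp2_by_excess_biotin(endogenous_biotin, bbp1_sites):
    """Endogenous biotin in excess of BBP-I, assumed to be bound to BBP-II."""
    return max(endogenous_biotin - bbp1_sites, 0.0)

# Hypothetical example values (not taken from the paper)
print(bbp2_by_difference(120.0, 40.0))      # 80.0
print(bbp2_by_excess_biotin(100.0, 40.0))   # 60.0
```

Negative differences are clipped to zero here as a design choice, because a negative amount of binding protein has no physical meaning and would only reflect assay scatter.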

Assays for avidin 

Avidin was measured in albumen or plasma exactly as
for BBP-I, except that the incubation was at 85 °C, a
temperature that destroys BBP-I and BBP-II. Unoccupied
biotin-binding sites were determined at room
temperature, where exchange does not occur.

Biotin analyses 

Undiluted yolk, plasma and albumen samples were
frozen and sent with feed samples to F. Hoffmann-
La Roche, Basel, Switzerland, where they were assayed
for biotin by using Lactobacillus plantarum as described
by Frigg & Brubacher (1976).
Fig. 1. Effect of dietary biotin on the concentration of biotin in the
plasma of laying hens

Open circles (○) are the control values for the treatment
groups immediately before the experimental diets were
begun. The dotted line represents the average of these
control values (33.9 ± 1.99 µg/l). Closed circles (●)
represent the average ± s.d. for samples taken after 1, 2,
and 5 weeks on the experimental diets. The right-hand
scale is based on a calculated blood volume of 109 ml.
Fig. 2. Effect of dietary biotin on the concentration of biotin in
chicken egg yolk

Open circles (○) represent control values for yolk samples
from the treatment group immediately before the
experimental diets were begun. The dotted line represents
the average of these control values (343.1 ± 48.3 µg/kg).
The divided circles represent the average for duplicate
yolk samples after the birds had been 1 week on the
experimental diets. Closed circles (●) represent the
mean ± s.d. for duplicate samples taken after the birds had
been 2 and 5 weeks on the experimental diet. The
right-hand scale is based on an average yolk weight of
17.6 g.



RESULTS 

Biotin content of diets 

The diets were formulated to contain 30, 100, 250, 500,
1000, 2000 and 4000 µg of biotin/kg. The analyses of
these diets showed respectively 93, 165, 332, 622, 1054,
1831 and 3540 µg of total biotin/kg. The data plotted in
the Figures correspond to available biotin, which is about
65 µg/kg less than the measured values. This correction
is justified by the fact that over 95 % of the biotin in wheat
is not available (Frigg, 1976; Whitehead et al., 1982).
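The correction from measured total biotin to available biotin is a constant subtraction, which can be written out as follows. This is only a worked illustration: the constant of 65 µg/kg is an assumption inferred from the gap between the measured totals and the dietary biotin values used elsewhere in the paper (for example 93 versus 28 µg/kg), and the function name is hypothetical.

```python
# Worked example of the available-biotin correction.
# ASSUMPTION: the unavailable (wheat-bound) biotin is taken as a constant
# 65 ug/kg, inferred from measured total 93 ug/kg vs. available 28 ug/kg.
UNAVAILABLE_BIOTIN_UG_PER_KG = 65

def available_biotin(measured_total_ug_per_kg,
                     unavailable_ug_per_kg=UNAVAILABLE_BIOTIN_UG_PER_KG):
    """Available biotin = measured total biotin minus the unavailable fraction."""
    return measured_total_ug_per_kg - unavailable_ug_per_kg

# Measured totals that are clearly legible in the text (ug of biotin/kg)
measured = [93, 165, 332, 622, 1054, 3540]
available = [available_biotin(m) for m in measured]
print(available)  # [28, 100, 267, 557, 989, 3475]
```

The results agree with the dietary biotin values used in the Figures and Tables to within about 10 µg/kg, which fits the text's description of the correction as approximate.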



Biotin content of plasma, yolk and albumen as a 
function of dietary biotin 

Figs. 1-3 show respectively the biotin levels in plasma, 
yolk and albumen before and after 1, 2 and 5 weeks on 
the experimental diets. Steady-state levels were achieved 
within 1 week in plasma and albumen and by 2 weeks in 
yolk. With respect to dietary biotin, the accumulation of 
biotin in plasma, yolk and albumen can be considered in 
three phases. In the normal range of dietary biotin 
(<250 µg/kg) there is a very strong relationship
Fig. 3. Effect of dietary biotin on the amount of biotin in chicken
egg albumen

Open circles (○) represent the control values for the
treatment groups immediately before the experimental
diets were begun. The dotted line represents the average of
these control values (26.4 ± 6.7 µg/kg). Closed circles (●)
represent the average ± s.d. for duplicate samples taken
after the birds had been 1, 2 and 5 weeks on the
experimental diets. The right-hand scale is based on a
recovered albumen weight of 36.6 g/egg.
Fig. 4. Concentration of biotin in the yolk (●) and albumen (○)
of chicken eggs as a function of the concentration of
biotin in the plasma of the laying hen

Each point represents the average of four to six biotin
determinations made on plasma, yolk and albumen
samples. Note the vertical scale is 25 times that of the
horizontal scale.



between dietary biotin and the biotin content of plasma
and yolk, whereas the biotin content of albumen is rather
low. Between 250 and 1000 µg/kg there is a plateau in the
biotin contents of plasma and yolk and a large increase
in the biotin content of albumen. Above 1000 µg of
biotin/kg there is a further gradual increase in biotin in
all three compartments. Considering the range from 30
to 3475 µg/kg, an increase of over 100-fold in dietary
biotin, there is a 2.5-fold increase in plasma biotin, a
6.8-fold increase in yolk biotin and a 44.5-fold increase
in albumen biotin. Hens fed the biotin-deficient diets for
6 weeks continued to lay eggs. The plasma and yolk
concentrations of biotin from these birds were 4.05 µg/l
and 13.0 µg/kg respectively. The biotin content of
albumen was below the detection limits of the assay.
These responses are broadly in agreement with the
observations of Frigg et al. (1984).

Biotin content of yolk and albumen as a function of 
plasma biotin 

The plasma distributes absorbed dietary biotin to the
various tissues of the body, and thus biotin in the plasma
is an intermediate between ingested biotin and biotin
deposited in the yolk and albumen of eggs. Fig. 4 shows
that biotin deposition in yolk and albumen is not a
simple function of plasma biotin concentration. Although
biotin is concentrated in yolk relative to plasma over the
entire range of experimental conditions, the efficiency of
transfer is greatly reduced when plasma biotin concentrations
are below 20 µg/l. The yolk-to-plasma concentration
ratio in this region is less than 10:1, whereas the
incremental increase above this region is near 35:1. The
relationship between biotin in albumen and plasma
shows an even more dramatic discontinuity. Below
64 µg/l, very little biotin is deposited in albumen. A
slight increase in plasma biotin above this level results in
very large increases in the deposition of biotin in
albumen.

BBP-I and BBP-II content of plasma and yolk 

Fig. 5 shows that hens transferred from a diet
containing 30 µg of biotin/kg of feed to one containing
987 µg/kg increase the production of a heat-labile BBP
appearing in both the plasma and yolk. The data in Fig.
5 are not corrected for isotope dilution. If such a
correction were made, there would be little difference in
the total biotin binding above 60 °C, whereas the difference
in biotin binding near 45 °C would be accentuated.

The kinetics of biotin exchange at 45 and 65 °C in the 
same yolk samples are shown in Fig. 6. Again, the 
increase in the amount of heat-labile BBP-II is evident in 
the samples from hens fed high-biotin diets. There is very 
little difference in the amount of BBP-I in these two 
samples. 

Similar analyses conducted at 45 and 65 °C on yolk
samples from hens fed a wide range of dietary biotin (Fig.
7) show that maximal production of BBP-I occurs on
diets containing 50 µg or more of biotin per kg of feed,
whereas maximal production of BBP-II occurs with
higher levels of dietary biotin (> 250 µg/kg). The
slightly lower amounts of BBP-I at higher dietary biotin
levels are not considered significant at this time because
the large amounts of endogenous biotin render the BBP-I
assay less accurate in this region.

Fig. 8 shows that there is an 8-10-fold excess of
endogenous biotin over BBP-I in yolk from hens fed very
high levels of biotin. The mean ± s.d. of this ratio for 20
assays performed on yolk samples from pre-experimental
control hens maintained on 30 µg of biotin/kg of diet is
1.64 ± 0.20. If one assumes that the excess biotin in the
samples is due to BBP-II-bound biotin, the ratio of BBP-II
to BBP-I increases from 0.64 at 30 µg of biotin/kg to 5
at 500 µg/kg. The biotin content of yolk is included in
Fig. 8 as a reference to show that the data obtained by
microbiological assay are qualitatively similar to those
[Fig. 5 legend, only partly recoverable: samples (●) were incubated for 30 min; samples from the same birds were analysed after they had been maintained for 3 weeks on the ...]


obtained by isotope-exchange assays of the binding
protein. Assays at 45 °C show that BBP-II is saturated
with biotin even in yolk samples obtained from hens fed
a biotin-deficient diet for 5 weeks.
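The step from the measured biotin-to-BBP-I ratio to the BBP-II-to-BBP-I ratio quoted above is a subtraction of one. The sketch below makes that arithmetic explicit under the assumption stated in the text, namely that BBP-I is saturated and all remaining biotin is bound to BBP-II; the function name is hypothetical.

```python
# Worked example: BBP-II/BBP-I ratio from the endogenous biotin/BBP-I ratio.
# ASSUMPTION (as stated in the text): BBP-I is saturated with biotin and all
# excess biotin is bound to BBP-II, so
#   biotin / BBP-I = 1 + BBP-II / BBP-I.

def bbp2_to_bbp1_ratio(biotin_to_bbp1_ratio):
    """Return BBP-II/BBP-I given the endogenous biotin/BBP-I ratio."""
    if biotin_to_bbp1_ratio < 1.0:
        raise ValueError("a ratio below 1 means BBP-I is not saturated")
    return biotin_to_bbp1_ratio - 1.0

# Control hens on 30 ug of biotin/kg of diet: measured ratio 1.64
print(round(bbp2_to_bbp1_ratio(1.64), 2))  # 0.64
```

By the same relation, a BBP-II-to-BBP-I ratio of 5 corresponds to a measured biotin-to-BBP-I ratio of 6.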

Content and fractional saturation of avidin in albumen 

Table 2 shows that the concentration of avidin in
albumen is unaffected by dietary biotin. The fractional
saturation is strongly dependent on dietary biotin.
Avidin is over 90% saturated with biotin when dietary
biotin exceeds about 1500 µg/kg.

BBPs in the plasma of immature and adult chickens of 
both sexes 

As noted in the Materials and methods section, the
amounts of the various binding proteins cannot be 
measured with high accuracy when they are present in 
mixtures. Table 3 presents the analysis of BBP-I, BBP-II 
and avidin in the plasma of immature and adult male and 
female chickens. The identification is based on heat- 
stability. BBP-I is present in the plasma of immature 
chickens of both sexes at concentrations about one-tenth 
that found in laying hens. BBP-II was detected only in 
the plasma of laying hens. Avidin was present in 



Fig. 6. Kinetics of biotin exchange at 45 °C (○, △) and 65 °C
(●, ▲) show the accumulation of a heat-labile
biotin-binding protein in egg yolks from hens fed a diet
with 30 µg of biotin/kg of feed (○, ●) and then fed a
diet with 1000 µg/kg for 3 weeks (△, ▲)
Fig. 7. Concentration of biotin-binding sites in chicken egg yolk
determined by radioligand exchange at 45 °C (○) and
65 °C (●) as a function of dietary biotin

The dotted reference line is the microbiologically determined
endogenous biotin content of the same samples
from Fig. 2, scaled to coincide with the data points.



Table 3. Estimation of the amounts of various biotin-binding
proteins in the plasma of immature and adult chickens

Values are expressed as the concentration of biotin-binding
sites as determined by radioligand exchange assays for
40 min at 45 °C for BBP-II, 65 °C for BBP-I and 85 °C for
avidin.

                               Concn. (nM)
Age         Sex        BBP-II     BBP-I      Avidin

1 day       Male       < 0.6      ~ 1.8      < 1.0
            Female     < 0.5      ~ 2.3      < 1.5
2 weeks     Male       < 1.0      ~ 2.6      < 1.3
3 weeks     Female     < 0.5      ~ 1.5      < 0.5
6 weeks     Female     < 0.8      ~ 3.0      < 0.8
8 weeks     Male       < 1.0      ~ 4.2      < 0.9
9 weeks     Female     < 0.8      ~ 2.7      < 0.7
11 weeks    Male       < 0.6      ~ 2.3      < 1.2
Adult       Female     40-120     25-?0      n.d.*
            Male       ~ 0        < 2.9      10.5

* n.d., not determined.









significant amounts only in the plasma of adult males. In 
all samples, the amount of biotin equals or exceeds the 
available binding sites. 



DISCUSSION 

Discovery of BBP-II 

As originally designed, there were two objectives in the 
present study. The first was to determine if biotin 
transport to the yolk was limited by BBP. The other was 
to determine if apo-BBP could be generated in vivo and 
be transported to yolk. At the time, only one BBP was 
known from yolk and it was stable at 65 °C. Our initial 
results at 65 °C (Fig. 8) with a new assay based on
[³H]biotin exchange produced the expected saturation
profile, but unexpectedly revealed severalfold more
biotin in yolk than biotin-binding sites. This paradox was
resolved when a second, more abundant but heat-labile,
BBP (BBP-II) was discovered. The discovery of BBP-II
seems to be dependent in part on the new assay, which
uses [³H]biotin of high specific radioactivity and permits
a 100-fold higher dilution of samples (L. Bush &
H. B. White III, unpublished work). [¹⁴C]Biotin-based
analyses of egg yolk which should have contained BBP-II
show no heat-labile component below 70 °C (White et al.,
1976; Kulomaa et al., 1981). Furthermore, the earlier
assay procedure had not been developed so that the 
presence of excess endogenous biotin could be detected. 
Although quantification of the amounts of BBP-I and 
BBP-II in mixtures is imprecise, the general patterns that 
have emerged are clear, as discussed below. 

Relationship of plasma biotin to yolk biotin 

The deposition of biotin in yolk is not the linear or 
saturable function of biotin in the plasma that would be 
expected for a simple receptor-mediated or diffusional 
process. Rather, biotin at low concentrations in plasma 
is transferred to yolk less efficiently than it is at higher 
concentrations (Fig. 4). This pattern can be qualitatively 
explained by the presence of two BBPs whose production 
is differentially dependent on dietary biotin and whose 
efficiencies of transfer to yolk are different. 

The plasma concentration of BBP-I is rather constant 
in laying hens unless dietary biotin is severely restricted, 
in which case the concentration of BBP-I is lower. The 
fact that BBP-I remains saturated with biotin even in 
biotin-deficient hens (Fig. 8) indicates that the synthesis, 
secretion or stability of this protein is dependent on 
biotin. The pattern for the production of BBP-II is 
similar, except that a greater response is observed and the 
response saturates at higher dietary biotin. Thus over the



Table 2. Biotin-binding capacity and biotin saturation of avidin in albumen from hens fed different amounts of biotin

Dietary biotin (µg/kg): 28, 100, 257, 557, 987, 1766, 3475

Biotin-binding capacity* (µg/kg of albumen), rows Pretreatment, Week 1,
Week 2 and Week 5; non-blank cells in printed order:
5.37, 7.52, 5.34, 5.78, 5.68, 5.25, 5.15,
6.49, 7.60, 5.49, 6.15, 4.64, 4.39,
7.23, 5.64, 5.53, 6.60, 4.74, 4.88, 4.66,
5.17, 4.80, 3.86, 5.11, 4.17, 4.31

Biotin saturation (%) (average for weeks 1 and 2): 6.3, 22.3, 54.3, 70.1, 81.5, 94.4, 97.2

* The values reported have been corrected for losses due to avidin binding to surfaces at the high dilutions used in the [³H]biotin-based assay.

normal dietary biotin range the production of BBP-II
relative to BBP-I increases from less than 1 at 30 µg of
biotin/kg of feed to perhaps greater than 5 at 250 µg/kg.
The concentration ratio of BBP-II to BBP-I in yolk is
greater than in plasma, indicating that BBP-II is
deposited more efficiently in the yolk than is BBP-I.

We conclude that normal biotin deposition in egg is
dependent on, and stoichiometric with, BBPs, as
previously asserted (White, 1985). However, at very high
dietary biotin levels there is increased biotin deposition
in yolk in the absence of increased production of BBP.
This additional biotin behaves as free biotin in our assay.
The unexpectedly high efficiency with which this 'free'
biotin is transferred from plasma to yolk suggests that it
may in fact be bound weakly to a second site on BBP-II
or be associated with lipids (Trager, 1948).

Dietary biotin absorbed in excess of that necessary to
saturate BBP-I and BBP-II at maximal production
should appear as free biotin; however, as indicated above,
'free' biotin is not detected in plasma or yolk until
dietary biotin is considerably increased. This excess
biotin in the plasma is scavenged very efficiently by
avidin in the oviduct. Only when avidin becomes
saturated are there significant increases in free biotin in
plasma and yolk (Figs. 1 and 2).

Regulation of the production of BBP-I and BBP-II 

The synthesis of both BBP-I and BBP-II is stimulated
during egg laying. This implies that the gene (or genes)
for these two proteins is (are) induced by oestrogen, as
has been suggested by Murty & Adiga (1985). The
observation that the production of BBP-I and BBP-II
was also dependent on biotin availability was unexpected.
Thus both sex hormones and a specific ligand regulate
the production of both proteins. The mechanism of the
biotin effect is not known. Although transcriptional
regulation is possible, a translational regulation can be
envisaged in which ligand binding to the nascent protein
must occur before termination or secretion. A less
efficient biotin-dependent mechanism could depend on
the instability of the apoprotein.

Mechanism of biotin deposition in the oocyte 

Ultrastructural studies show that yolk deposition
occurs via a very active clathrin-mediated endocytosis
(Perry et al., 1978, 1984; Griffin et al., 1984). The
deposition of specific proteins such as vitellogenin and
low-density lipoproteins has been shown to be receptor-
mediated (Woods & Roth, 1984; Krummins & Roth,
1981). Despite the presence of these receptors, the
concentration of these proteins in yolk is only about six
times that in plasma. Vitamin D is concentrated
by this same factor and is thought to be deposited
as a complex of its binding protein and the phosvitin 
moiety of vitellogenin (Fraser & Emtage, 1976). Similarly 
riboflavin is concentrated 6-fold in yolk (White et al.,
1976), yet no receptor has been detected for its binding
protein (Benore-Parsons, 1986). Other plasma-derived
yolk proteins, such as immunoglobulins, serum albumin
and transferrin, are not concentrated in yolk (Schjeide
et al., 1976), even though receptors for immunoglobulins
have been reported.

Biotin is concentrated by more than 20-fold in yolk 
relative to plasma. This ratio varies from about 3 when 
chickens are fed biotin-deficient diets to almost 30 when 
diets of high protein content are fed. In comparison with 



other yolk-to-plasma concentration ratios, these values
are quite high and suggest that a specific receptor-
mediated transport system exists for BBP-II and perhaps
BBP-I as well.

Transfer of biotin-binding proteins from yolk to chick 
plasma 

Several yolk proteins are transferred to the plasma of 
the embryo. For instance, the immunity of the hen is 
transferred to the chick via immunoglobulins deposited 
in yolk (Loecken & Roth, 1983). Similarly, maternal 
serum transferrin, in pigeons at least, appears in the chick 
plasma (Frelinger, 1971). The presence of BBP-I in the 
plasma of newly hatched chicks (Table 3) suggests that 
there may be transfer of BBP-I from the yolk. Although 
this possibility cannot be ruled out, the fact that the 
concentration of BBP-I remains fairly constant during 
the rapid growth of a chick implies that most, if not all, 
BBP-I in older chicks at least is synthesized by the chick 
and not derived from the yolk. The apparent absence of 
BBP-II in chick plasma precludes significant yolk-to- 
plasma transfer. 

Relationship of BBP-I to BBP-II 

Although it is clear that yolk BBPs are distinct from 
egg-white avidin (Meslar et al., 1978; Murthy & Adiga,
1984), the structural and genetic relationships between 
BBP-I and BBP-II are yet to be determined. The fact that 
their concentrations show different dependences on 
dietary biotin, that their distribution differs between hen 
and chick plasma, and that their distributions between 
soluble and particulate fractions of diluted egg yolk differ 
(results not shown) suggest that two different gene 
products are present. However, they could be differently 
modified products of the same gene. Furthermore, there 
is the possibility that both are tetrameric isoproteins 
which can generate hybrid intermediate forms that 
associate with a receptor or other proteins with different 
affinities. 

Whatever the structural and genetic relationships are 
between BBP-I and BBP-II, there seems to be a 
functional differentiation of the two proteins. BBP-I is 
present in the plasma of immature chickens. Despite its 
increased concentration in the plasma of laying hens 
relative to chick plasma, it is transferred to yolk less 
efficiently than is BBP-II. BBP-II, on the other hand, is 
detected only in the plasma of laying hens and is 
normally present at higher concentrations than BBP-I. 
These patterns suggest that BBP-I has primarily a 
maintenance function and secondarily it serves to 
transport biotin to yolk. The primary and perhaps sole
function of BBP-II seems to be the transport of biotin to
the yolk.

Function of avidin 

Avidin has been viewed as one of the many 
antimicrobial proteins of egg albumen (Tranter & Board, 
1982). This view is further supported by the induction of 
avidin synthesis at the site of tissue injury in chickens 
(Elo & Korpela, 1984). The absence of significant
amounts of biotin in egg albumen suggests that avidin has
little role in the biotin nutrition of the embryo. Our
results are consistent with a non-nutritive role for avidin
and show that biotin deposition in albumen occurs only
when chickens are fed on diets that have biotin contents
well in excess of those of natural foods. We also observed
a saturated BBP with the stability of avidin in the plasma
of adult males (Table 3), confirming the earlier
observations of Elo et al. (1979). Perhaps avidin does
have a role in biotin metabolism in cockerels.

Comparison of biotin and riboflavin transport to the 
chicken egg yolk 

Parallel studies on the binding-protein-mediated
deposition of riboflavin in the oocyte (White et al., 1986)
show a different and much simpler pattern than that
described here for biotin. Riboflavin availability does not
regulate the production of riboflavin-binding protein.
Apoprotein is produced and deposited in yolk. From this
sample of two vitamins, it is clear that generalizations
about the mechanism of protein-mediated vitamin
transport to the oocyte may be hard to find.

We thank Mr. John Armstrong, Mr Christine Murnin and 
Ms. Kate Heron for their expert technical assistance. H.B. W. 
was supported by National Institutes of Health grants AM 
27873 and AM 34445 and by the AFRC-Underwood Fund 
during a sabbatical leave spent at the Poultry Research Centre. 